Re: [Bug-wget] flock is not available on solaris 10 (at least sparc)

2015-12-12 Thread Tim Rühsen
You might be right, thanks for pointing that out (I never use tarballs,
that's why I don't have them in mind).

Giuseppe created the tarballs on a system with flock() available. AFAIK, 
gnulib jumps in only when needed. Maybe the gnulib flock implementation didn't
find its way into the tarball? We have to check that.

Tim

On Saturday, 12 December 2015 at 18:21:43, Darshit Shah wrote:
> If he's talking about releases, he is probably using the release
> tarballs. Those don't need ./bootstrap
> 
> On 12 December 2015 at 18:15, Tim Rühsen <tim.rueh...@gmx.de> wrote:
> > Hi Christian,
> > 
> > did you ./bootstrap, ./configure, make clean, make after updating ?
> > 
> > Especially without ./bootstrap, flock might not be taken from gnulib (which
> > calls lockf if flock is not available).
> > 
> > Regards, Tim
> > 
> >> On Saturday, 12 December 2015 at 17:42:10, Christian Jullien wrote:
> >> Hi, trying to compile wget-1.17 and now wget-1.17.1 on solaris 10 sparc
> >> using gcc  5.2.0 , I get:
> >> 
> >> 
> >> 
> >> hsts.c:505:11: warning: implicit declaration of function 'flock'
> >> [-Wimplicit-function-declaration]
> >> 
> >>flock (fd, LOCK_EX);
> >>
> >>^
> >> 
> >> hsts.c:505:22: error: 'LOCK_EX' undeclared (first use in this function)
> >> 
> >>flock (fd, LOCK_EX);
> >>
> >>   ^
> >> 
> >> hsts.c:505:22: note: each undeclared identifier is reported only once for
> >> each function it appears in
> >> 
> >> Makefile:1573: recipe for target 'hsts.o' failed
> >> 
> >> 
> >> 
> >> I solved the compilation issue by replacing flock (line 505) with
> >> lockf, which is POSIX.1-2001
> >> 
> >> =>  lockf (fd, F_LOCK, 0);
> >> 
> >> 
> >> 
> >> Hope it helps.
> >> 
> >> 
> >> 
> >> Christian




Re: [Bug-wget] flock is not available on solaris 10 (at least sparc)

2015-12-12 Thread Tim Rühsen
On Saturday, 12 December 2015 at 17:57:00, Darshit Shah wrote:
> Didn't we fix this by using the Gnulib module?
> I thought all the tests were passing on Solaris after that. What went
> wrong with the release?
> Immediately after the release, even the Solaris buildbot complained
> about a broken build.

I thought that the buildbot runs on each push to the git repository!?
If so, how can the last commit (the 1.17.1 tagged commit) cause a failure that
did not come up before? Bug in the gnulib module?


The error tells me nothing:
(This is from the buildbot output)

gmake[3]: Entering directory '/export/home/buildbot/slave/wget-solaris10-
i386/build/src'
/opt/csw/bin/gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -
DLOCALEDIR=\"/usr/local/share/locale\" -I.  -I../lib -I../lib -D_REENTRANT  -
I/opt/csw/include   -I/opt/csw/include -I/opt/csw/include   -I/opt/csw/include 
-I/opt/csw/include/p11-kit-1   -DHAVE_LIBGNUTLS -I/opt/csw/include   -
I/opt/csw/include   -DNDEBUG  -MT connect.o -MD -MP -MF .deps/connect.Tpo -c -
o connect.o connect.c
In file included from /usr/include/sys/types.h:17:0,
 from ../lib/sys/types.h:28,
 from sysdep.h:85,
 from wget.h:47,
 from connect.c:32:
/opt/csw/lib/gcc/i386-pc-solaris2.10/5.2.0/include-
fixed/sys/feature_tests.h:346:2: error: #error "Compiler or options invalid 
for pre-UNIX 03 X/Open applications and pre-2001 POSIX applications"
 #error "Compiler or options invalid for pre-UNIX 03 X/Open applications \
  ^

Tim

> 
> On 12 December 2015 at 17:42, Christian Jullien  wrote:
> > Hi, trying to compile wget-1.17 and now wget-1.17.1 on solaris 10 sparc
> > using gcc  5.2.0 , I get:
> > 
> > 
> > 
> > hsts.c:505:11: warning: implicit declaration of function 'flock'
> > [-Wimplicit-function-declaration]
> > 
> >flock (fd, LOCK_EX);
> >
> >^
> > 
> > hsts.c:505:22: error: 'LOCK_EX' undeclared (first use in this function)
> > 
> >flock (fd, LOCK_EX);
> >
> >   ^
> > 
> > hsts.c:505:22: note: each undeclared identifier is reported only once for
> > each function it appears in
> > 
> > Makefile:1573: recipe for target 'hsts.o' failed
> > 
> > 
> > 
> > I solved the compilation issue by replacing flock (line 505) with lockf,
> > which is POSIX.1-2001
> > 
> > =>  lockf (fd, F_LOCK, 0);
> > 
> > 
> > 
> > Hope it helps.
> > 
> > 
> > 
> > Christian




Re: [Bug-wget] flock is not available on solaris 10 (at least sparc)

2015-12-12 Thread Tim Rühsen
Hi Christian,

did you ./bootstrap, ./configure, make clean, make after updating ?

Especially without ./bootstrap, flock might not be taken from gnulib (which
calls lockf if flock is not available).
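For illustration, here is a minimal sketch of what such a fallback can look
like (this is NOT gnulib's actual code, just the idea). Note the semantic
difference: flock() locks the whole file, while lockf(fd, F_LOCK, 0) locks
from the current file offset to EOF, so it should be issued while the offset
is still at 0:

#include <errno.h>
#include <unistd.h>     /* lockf(), F_LOCK, F_ULOCK */

#ifndef LOCK_EX         /* no flock() on this system */
# define LOCK_EX 2
# define LOCK_UN 8
static int
flock (int fd, int operation)
{
  switch (operation)
    {
    case LOCK_EX:
      return lockf (fd, F_LOCK, 0);   /* blocking exclusive lock to EOF */
    case LOCK_UN:
      return lockf (fd, F_ULOCK, 0);  /* release the lock */
    default:
      errno = EINVAL;
      return -1;
    }
}
#endif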

Regards, Tim

On Saturday, 12 December 2015 at 17:42:10, Christian Jullien wrote:
> Hi, trying to compile wget-1.17 and now wget-1.17.1 on solaris 10 sparc
> using gcc  5.2.0 , I get:
> 
> 
> 
> hsts.c:505:11: warning: implicit declaration of function 'flock'
> [-Wimplicit-function-declaration]
> 
>flock (fd, LOCK_EX);
> 
>^
> 
> hsts.c:505:22: error: 'LOCK_EX' undeclared (first use in this function)
> 
>flock (fd, LOCK_EX);
> 
>   ^
> 
> hsts.c:505:22: note: each undeclared identifier is reported only once for
> each function it appears in
> 
> Makefile:1573: recipe for target 'hsts.o' failed
> 
> 
> 
> I solved the compilation issue by replacing flock (line 505) with lockf,
> which is POSIX.1-2001
> 
> =>  lockf (fd, F_LOCK, 0);
> 
> 
> 
> Hope it helps.
> 
> 
> 
> Christian




Re: [Bug-wget] flock is not available on solaris 10 (at least sparc)

2015-12-12 Thread Tim Rühsen
It is just the compiler/compiler flags that the buildbot uses.

With CC and CFLAGS not set, the error is reproducible.
With CC=gcc and CFLAGS="-std=c89 -Wall -O2" the build runs fine.

I'll run some more tests to find the exact trigger. I guess Dagobert might need
to adjust some settings for the buildbot.

Tim

On Saturday, 12 December 2015 at 18:52:57, Darshit Shah wrote:
> Exactly! But I checked the gnulib changelog. Nothing major seems to
> have changed that causes such a fault.
> Most of the commits seem to be documentation related only.
> 
> Also, like you said, the compiler message is not helpful at all. Need
> someone who is more experienced in such systems to take a look.
> 
> On 12 December 2015 at 18:49, Tim Rühsen <tim.rueh...@gmx.de> wrote:
> > On Saturday, 12 December 2015 at 17:57:00, Darshit Shah wrote:
> >> Didn't we fix this by using the Gnulib module?
> >> I thought all the tests were passing on Solaris after that. What went
> >> wrong with the release?
> >> Immediately after the release, even the Solaris buildbot complained
> >> about a broken build.
> > 
> > I thought that the buildbot runs on each push to the git repository!?
> > If so, how can the last commit (the 1.17.1 tagged commit) cause a failure
> > that did not come up before? Bug in the gnulib module?
> > 
> > 
> > The error tells me nothing:
> > (This is from the buildbot output)
> > 
> > gmake[3]: Entering directory '/export/home/buildbot/slave/wget-solaris10-
> > i386/build/src'
> > /opt/csw/bin/gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"
> > - DLOCALEDIR=\"/usr/local/share/locale\" -I.  -I../lib -I../lib
> > -D_REENTRANT  - I/opt/csw/include   -I/opt/csw/include -I/opt/csw/include
> >   -I/opt/csw/include -I/opt/csw/include/p11-kit-1   -DHAVE_LIBGNUTLS
> > -I/opt/csw/include   - I/opt/csw/include   -DNDEBUG  -MT connect.o -MD
> > -MP -MF .deps/connect.Tpo -c - o connect.o connect.c
> > In file included from /usr/include/sys/types.h:17:0,
> > 
> >  from ../lib/sys/types.h:28,
> >  from sysdep.h:85,
> >  from wget.h:47,
> > 
> >  from connect.c:32:
> > /opt/csw/lib/gcc/i386-pc-solaris2.10/5.2.0/include-
> > fixed/sys/feature_tests.h:346:2: error: #error "Compiler or options
> > invalid
> > for pre-UNIX 03 X/Open applications and pre-2001 POSIX applications"
> > 
> >  #error "Compiler or options invalid for pre-UNIX 03 X/Open applications \
> >  
> >   ^
> > 
> > Tim
> > 
> >> On 12 December 2015 at 17:42, Christian Jullien <eli...@orange.fr> wrote:
> >> > Hi, trying to compile wget-1.17 and now wget-1.17.1 on solaris 10 sparc
> >> > using gcc  5.2.0 , I get:
> >> > 
> >> > 
> >> > 
> >> > hsts.c:505:11: warning: implicit declaration of function 'flock'
> >> > [-Wimplicit-function-declaration]
> >> > 
> >> >flock (fd, LOCK_EX);
> >> >
> >> >^
> >> > 
> >> > hsts.c:505:22: error: 'LOCK_EX' undeclared (first use in this function)
> >> > 
> >> >flock (fd, LOCK_EX);
> >> >
> >> >   ^
> >> > 
> >> > hsts.c:505:22: note: each undeclared identifier is reported only once
> >> > for
> >> > each function it appears in
> >> > 
> >> > Makefile:1573: recipe for target 'hsts.o' failed
> >> > 
> >> > 
> >> > 
> >> > I solved the compilation issue by replacing flock (line 505) with
> >> > lockf, which is POSIX.1-2001
> >> > 
> >> > =>  lockf (fd, F_LOCK, 0);
> >> > 
> >> > 
> >> > 
> >> > Hope it helps.
> >> > 
> >> > 
> >> > 
> >> > Christian



Re: [Bug-wget] flock is not available on solaris 10 (at least sparc)

2015-12-12 Thread Tim Rühsen
Just for the record:

Wget from the tarball compiles on Solaris 10 x86 and Sparc (OpenCSW build
farm) with CFLAGS="-std=c89" set (@Dagobert: Could you set this flag for the
buildbot?).

./configure says:
checking for flock... no

So the fallback from gnulib jumps in and voila.

Tim

On Saturday, 12 December 2015 at 19:15:17, Tim Rühsen wrote:
> It is just the compiler/compiler flags that the buildbot uses.
> 
> With CC and CFLAGS not set, the error is reproducible.
> With CC=gcc and CFLAGS="-std=c89 -Wall -O2" the build runs fine.
> 
> I'll run some more tests to find the exact trigger. I guess Dagobert might
> need to adjust some settings for the buildbot.
> 
> Tim
> 
> On Saturday, 12 December 2015 at 18:52:57, Darshit Shah wrote:
> > Exactly! But I checked the gnulib changelog. Nothing major seems to
> > have changed that causes such a fault.
> > Most of the commits seem to be documentation related only.
> > 
> > Also, like you said, the compiler message is not helpful at all. Need
> > someone who is more experienced in such systems to take a look.
> > 
> > On 12 December 2015 at 18:49, Tim Rühsen <tim.rueh...@gmx.de> wrote:
> > > On Saturday, 12 December 2015 at 17:57:00, Darshit Shah wrote:
> > >> Didn't we fix this by using the Gnulib module?
> > >> I thought all the tests were passing on Solaris after that. What went
> > >> wrong with the release?
> > >> Immediately after the release, even the Solaris buildbot complained
> > >> about a broken build.
> > > 
> > > I thought that the buildbot runs on each push to the git repository!?
> > > If so, how can the last commit (the 1.17.1 tagged commit) cause a
> > > failure that did not come up before? Bug in the gnulib module?
> > > 
> > > 
> > > The error tells me nothing:
> > > (This is from the buildbot output)
> > > 
> > > gmake[3]: Entering directory
> > > '/export/home/buildbot/slave/wget-solaris10-
> > > i386/build/src'
> > > /opt/csw/bin/gcc -DHAVE_CONFIG_H
> > > -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"
> > > - DLOCALEDIR=\"/usr/local/share/locale\" -I.  -I../lib -I../lib
> > > -D_REENTRANT  - I/opt/csw/include   -I/opt/csw/include
> > > -I/opt/csw/include
> > > 
> > >   -I/opt/csw/include -I/opt/csw/include/p11-kit-1   -DHAVE_LIBGNUTLS
> > > 
> > > -I/opt/csw/include   - I/opt/csw/include   -DNDEBUG  -MT connect.o -MD
> > > -MP -MF .deps/connect.Tpo -c - o connect.o connect.c
> > > In file included from /usr/include/sys/types.h:17:0,
> > > 
> > >  from ../lib/sys/types.h:28,
> > >  from sysdep.h:85,
> > >  from wget.h:47,
> > > 
> > >  from connect.c:32:
> > > /opt/csw/lib/gcc/i386-pc-solaris2.10/5.2.0/include-
> > > fixed/sys/feature_tests.h:346:2: error: #error "Compiler or options
> > > invalid
> > > for pre-UNIX 03 X/Open applications and pre-2001 POSIX applications"
> > > 
> > >  #error "Compiler or options invalid for pre-UNIX 03 X/Open applications
> > >  \
> > >  
> > >   ^
> > > 
> > > Tim
> > > 
> > >> On 12 December 2015 at 17:42, Christian Jullien <eli...@orange.fr> 
wrote:
> > >> > Hi, trying to compile wget-1.17 and now wget-1.17.1 on solaris 10
> > >> > sparc
> > >> > using gcc  5.2.0 , I get:
> > >> > 
> > >> > 
> > >> > 
> > >> > hsts.c:505:11: warning: implicit declaration of function 'flock'
> > >> > [-Wimplicit-function-declaration]
> > >> > 
> > >> >flock (fd, LOCK_EX);
> > >> >
> > >> >^
> > >> > 
> > >> > hsts.c:505:22: error: 'LOCK_EX' undeclared (first use in this
> > >> > function)
> > >> > 
> > >> >flock (fd, LOCK_EX);
> > >> >
> > >> >   ^
> > >> > 
> > >> > hsts.c:505:22: note: each undeclared identifier is reported only once
> > >> > for
> > >> > each function it appears in
> > >> > 
> > >> > Makefile:1573: recipe for target 'hsts.o' failed
> > >> > 
> > >> > 
> > >> > 
> > >> > I solved the compilation issue by replacing flock (line 505) with
> > >> > lockf, which is POSIX.1-2001
> > >> > 
> > >> > =>  lockf (fd, F_LOCK, 0);
> > >> > 
> > >> > 
> > >> > 
> > >> > Hope it helps.
> > >> > 
> > >> > 
> > >> > 
> > >> > Christian




Re: [Bug-wget] flock is not available on solaris 10 (at least sparc)

2015-12-12 Thread Tim Rühsen
On Saturday, 12 December 2015 at 20:23:17, Christian Jullien wrote:
> => Ok, _XPG6 is now required, 3rd attempt
> $ CFLAGS="-D_XPG6" PKG_CONFIG_PATH=/usr/local/lib/pkgconfig/ ./configure;
> make

Could you try -std=c89 instead of -D_XPG6 ?
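For context, a rough paraphrase of the failing check in Solaris
<sys/feature_tests.h> (NOT the verbatim header): an application that requests
a pre-UNIX-03 X/Open or pre-2001 POSIX level must not be compiled in C99 (or
later) mode.

/* paraphrased idea only -- see the real <sys/feature_tests.h> */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L && !defined(_XPG6)
# error "Compiler or options invalid for pre-UNIX 03 X/Open applications \
and pre-2001 POSIX applications"
#endif

gcc 5.2 defaults to -std=gnu11, which sets __STDC_VERSION__ >= 199901L, so
either -std=c89 (the compiler no longer claims C99) or -D_XPG6 (the
application claims UNIX 03) should silence the check.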

Tim




Re: [Bug-wget] Can't build wget with GnuTLS on Mac OS X

2015-12-06 Thread Tim Rühsen
On Sunday, 6 December 2015 at 12:48:45, 桃源老師 wrote:
> Hello, Tim-san,
> 
> > 2015/12/06 7:47 A.M. Tim Rühsen <tim.rueh...@gmx.de> wrote:
> > 
> > On Sunday, 6 December 2015 at 01:29:17, 桃源老師 wrote:
> >> my configure statement:
> >> $ export TARGET ="/Volumes/ffmpeg_compile"
> >> $ LDFLAGS=-L${TARGET}/lib LIBS=-lgmp  ./configure --prefix=${TAREGT}
> > 
> > Is this correct ?
> > --prefix=${TAREGT}
> > 
> > Maybe TARGET ?
> 
> Sorry it's my typo error...
> 
> > "_wrap_nettle_pk_generate_params in libgnutls.a(pk.o)"
> > You statically link  GnuTLS ? Try it dynamically, else you need libnettle
> > linked *after* libgnutls.
> 
> Well, now I can build wget with dynamically linked GnuTLS.
> 
> But if I can, I'd like to have a statically linked one.  You mentioned that
> "need libnettle linked after libgnutls", but I don't know how should I.
> Could you please provide the way to build statically linked binary of wget?

IMO, it's a bad idea... (and I don't have much experience with that). Please
read http://www.akkadia.org/drepper/no_static_linking.html and reconsider
before reading on.


If wget links dynamically, you can 'ldd wget' to see all the libraries needed
for static linking. You have to figure out the order on the command line
yourself. The gcc option -static is helpful (it tells the linker to use static
.a libraries instead of dynamic .so - if static ones are available).

Just modify/extend your original command line:
gcc  -I/Volumes/ffmpeg_compile/include -DHAVE_LIBGNUTLS -
I/Volumes/ffmpeg_compile/include -DNDEBUG   -L/Volumes/ffmpeg_compile/lib -o 
wget connect.o convert.o cookies.o ftp.o css_.o css-url.o ftp-basic.o ftp-ls.o 
hash.o host.o hsts.o html-parse.o html-url.o http.o init.o log.o main.o 
netrc.o progress.o ptimer.o recur.o res.o retr.o spider.o url.o warc.o utils.o 
exits.o build_info.o   version.o ftp-opie.o gnutls.o http-ntlm.o 
../lib/libgnu.a /Volumes/ffmpeg_compile/lib/libiconv.a  -lnettle -
L/Volumes/ffmpeg_compile/lib -lgnutls -L/Volumes/ffmpeg_compile/lib -lz -lgmp

GnuTLS uses nettle functions, so here the order is wrong. Link nettle after
gnutls, followed by gmp.
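As a sketch, the corrected tail of the link line above could look like this
(one assumption on my side: the missing _nettle_dsa_* and _nettle_ecc_*
symbols live in nettle's companion library libhogweed, so it very likely has
to be added as well):

... ../lib/libgnu.a -L/Volumes/ffmpeg_compile/lib -lgnutls -lhogweed -lnettle
-lgmp -lz /Volumes/ffmpeg_compile/lib/libiconv.a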

Be warned that your executable might be huge after linking.

Also, it might be that you need a different C standard library for static
linking; just search the net for details.

Tim




Re: [Bug-wget] Can't build wget with GnuTLS on Mac OS X

2015-12-05 Thread Tim Rühsen
On Sunday, 6 December 2015 at 01:29:17, 桃源老師 wrote:
> Hello,
> 
> I'm trying to build wget with GnuTLS on Mac OS X.
> 
> Version Info:
> Mac OS X: 10.11.1 El Capitan
> Xcode:   7.1.1
> GMP: 6.1.0
> Nettle:   3.1.1
> GnuTLS:3.4.7
> 
> Since I can build rtmpdump and ffmpeg with same GnuTLS library, I am
> currently thinking install of GnuTLS might be succeeded. But when I try to
> compile wget with GnuTLS, I get make error.
> 
> my configure statement:
> $ export TARGET ="/Volumes/ffmpeg_compile"
> $ LDFLAGS=-L${TARGET}/lib LIBS=-lgmp  ./configure --prefix=${TAREGT}

Is this correct ?
--prefix=${TAREGT} 

Maybe TARGET ?


"_wrap_nettle_pk_generate_params in libgnutls.a(pk.o)"
You statically link GnuTLS? Try it dynamically; otherwise you need libnettle
linked *after* libgnutls.


> 
> The make error:
> gcc  -I/Volumes/ffmpeg_compile/include -DHAVE_LIBGNUTLS
> -I/Volumes/ffmpeg_compile/include -DNDEBUG   -L/Volumes/ffmpeg_compile/lib
> -o wget connect.o convert.o cookies.o ftp.o css_.o css-url.o ftp-basic.o
> ftp-ls.o hash.o host.o hsts.o html-parse.o html-url.o http.o init.o log.o
> main.o netrc.o progress.o ptimer.o recur.o res.o retr.o spider.o url.o
> warc.o utils.o exits.o build_info.o   version.o ftp-opie.o gnutls.o
> http-ntlm.o ../lib/libgnu.a /Volumes/ffmpeg_compile/lib/libiconv.a 
> -lnettle -L/Volumes/ffmpeg_compile/lib -lgnutls
> -L/Volumes/ffmpeg_compile/lib -lz -lgmp Undefined symbols for architecture
> x86_64:
>   "_gnutls_protocol_set_priority", referenced from:
>   _ssl_connect_wget in gnutls.o
>   "_nettle_dsa_generate_params", referenced from:
>   _wrap_nettle_pk_generate_params in libgnutls.a(pk.o)
>   "_nettle_dsa_params_clear", referenced from:
>   _wrap_nettle_pk_generate_params in libgnutls.a(pk.o)
>   "_nettle_dsa_params_init", referenced from:
>   _wrap_nettle_pk_generate_params in libgnutls.a(pk.o)
>   "_nettle_dsa_sign", referenced from:
>   __wrap_nettle_pk_sign in libgnutls.a(pk.o)
>   "_nettle_dsa_signature_clear", referenced from:
>   __wrap_nettle_pk_sign in libgnutls.a(pk.o)
>   "_nettle_dsa_signature_init", referenced from:
>   __wrap_nettle_pk_sign in libgnutls.a(pk.o)
>   "_nettle_dsa_verify", referenced from:
>   __wrap_nettle_pk_verify in libgnutls.a(pk.o)
>   "_nettle_ecc_point_clear", referenced from:
>   __wrap_nettle_pk_verify in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_pub_params in libgnutls.a(pk.o)
>   _wrap_nettle_pk_generate_keys in libgnutls.a(pk.o)
>   __wrap_nettle_pk_derive in libgnutls.a(pk.o)
>   "_nettle_ecc_point_get", referenced from:
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   _wrap_nettle_pk_generate_keys in libgnutls.a(pk.o)
>   __wrap_nettle_pk_derive in libgnutls.a(pk.o)
>   "_nettle_ecc_point_init", referenced from:
>   __wrap_nettle_pk_verify in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_pub_params in libgnutls.a(pk.o)
>   _wrap_nettle_pk_generate_keys in libgnutls.a(pk.o)
>   __wrap_nettle_pk_derive in libgnutls.a(pk.o)
>   "_nettle_ecc_point_mul", referenced from:
>   __wrap_nettle_pk_derive in libgnutls.a(pk.o)
>   "_nettle_ecc_point_mul_g", referenced from:
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   "_nettle_ecc_point_set", referenced from:
>   __wrap_nettle_pk_verify in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_pub_params in libgnutls.a(pk.o)
>   __wrap_nettle_pk_derive in libgnutls.a(pk.o)
>   "_nettle_ecc_scalar_clear", referenced from:
>   __wrap_nettle_pk_sign in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   _wrap_nettle_pk_generate_keys in libgnutls.a(pk.o)
>   __wrap_nettle_pk_derive in libgnutls.a(pk.o)
>   "_nettle_ecc_scalar_get", referenced from:
>   _wrap_nettle_pk_generate_keys in libgnutls.a(pk.o)
>   "_nettle_ecc_scalar_init", referenced from:
>   __wrap_nettle_pk_sign in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   _wrap_nettle_pk_generate_keys in libgnutls.a(pk.o)
>   __wrap_nettle_pk_derive in libgnutls.a(pk.o)
>   "_nettle_ecc_scalar_set", referenced from:
>   __wrap_nettle_pk_sign in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   __wrap_nettle_pk_derive in libgnutls.a(pk.o)
>   "_nettle_ecc_size", referenced from:
>   __wrap_nettle_pk_sign in libgnutls.a(pk.o)
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   __wrap_nettle_pk_derive in libgnutls.a(pk.o)
>   "_nettle_ecc_size_a", referenced from:
>   _wrap_nettle_pk_verify_priv_params in libgnutls.a(pk.o)
>   "_nettle_ecdsa_generate_keypair", referenced from:
>   _wrap_nettle_pk_generate_keys in libgnutls.a(pk.o)
>   

Re: [Bug-wget] --no-check-cert does not avoid cert warning

2015-12-01 Thread Tim Rühsen
On Tuesday, 1 December 2015 at 21:48:52, Ander Juaristi wrote:
> On 12/01/2015 09:31 AM, Tim Ruehsen wrote:
> > Are you working on *nix ?
> > 
> > Try wget ... |& grep -v "WARNING: cannot verify"
> > 
> > To filter out the warnings you don't want to see. You could use egrep to
> > filter different lines at once.
> 
> That's a good idea but IMO it's a bit hackish. What's more, I think `|&'
> it's not a *nix feature, but rather a Bash feature.
> 
> It's basically `2>&1 |' behind the scenes, but worth pointing it out, anyway
> [1].
> 
> [1] http://www.gnu.org/software/bash/manual/bashref.html#Pipelines

You are right, it's a bashism. Doesn't work with dash.

But what works also with dash is

#!/bin/dash
{ wget -d xxx 2>&1 1>&3 | grep -v Saving 1>&2; } 3>&1
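(How it works: the trailing 3>&1 makes fd 3 a copy of the original stdout.
Inside the braces, 2>&1 sends wget's stderr into the pipe, while 1>&3 routes
wget's real stdout back to the terminal; grep then re-emits the filtered
stream on stderr via 1>&2. So only stderr passes through grep.)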

Found it here:
http://unix.stackexchange.com/questions/3514/how-to-grep-standard-error-stream-stderr

Tim




Re: [Bug-wget] --no-check-cert does not avoid cert warning

2015-11-30 Thread Tim Rühsen
There is the situation where --no-check-cert is implicitly set (.wgetrc, 
/etc/wgetrc, alias) and the user isn't aware of it. Just downloading without a 
warning opens a huge security hole because you can't verify where you 
downloaded it from (DNS attacks, MITM).
I leave it to your imagination what could happen to people in unsafe 
countries... this warning could save lives.

For an expert like Karl, this is just annoying.

The warning text could be worked on, making clear that you are really leaving
secure ground, that cert checking has been explicitly turned off, and how to
turn it on again. And only proceed if you really, really are aware of what you
are doing.

Of course all this applies to HTTP (plain text) as well. But someone
requesting HTTPS and then dropping the gained security should be warned by
default.

My thinking is a pessimistic approach, but as long as you can't be 100% sure
that bad things can't happen due to dropping the warning, we should keep it
(and improve it the best we can).

Tim


On Monday, 30 November 2015 at 15:27:08, Giuseppe Scrivano wrote:
> Hi Karl,
> 
> Karl Berry  writes:
> > With wget 1.17 (at least),
> > 
> > $ wget -nv --no-check-cert https://www.gnu.org -O /dev/null
> > 
> > WARNING: cannot verify www.gnu.org's certificate, issued by 'CN=Gandi
> > Standard SSL CA 2,O=Gandi,L=Paris,ST=Paris,C=FR':
> >   Unable to locally verify the issuer's authority.
> > 
> > Maybe I'm crazy, but it seems like pointless noise to complain that a
> > certificate cannot be verified when wget has been explicitly told not to
> > check it.  Looking at the source, the only way I see to get rid of the
> > warning is with --silent, which would also eliminate real errors.
> 
> the only difference with --no-check-cert is that wget will fail and exit
> immediately when the certificate is not valid.  The idea behind
> --no-check-cert was probably to not abort the execution of wget but
> still inform the user about an invalid certificate, as the documentation
> says:
> 
>   This option forces an ``insecure'' mode of
>   operation that turns the certificate verification errors into warnings
>   and allows you to proceed.
> 
> I am personally in favor of dropping the warning, as it is doing
> something the user asked to not do.
> 
> Anybody has something against this patch?
> 
> Regards,
> Giuseppe
> 
> diff --git a/doc/wget.texi b/doc/wget.texi
> index c647e33..6aeda72 100644
> --- a/doc/wget.texi
> +++ b/doc/wget.texi
> @@ -1714,9 +1714,7 @@ handshake and aborting the download if the
> verification fails. Although this provides more secure downloads, it does
> break
>  interoperability with some sites that worked with previous Wget
>  versions, particularly those using self-signed, expired, or otherwise
> -invalid certificates.  This option forces an ``insecure'' mode of
> -operation that turns the certificate verification errors into warnings
> -and allows you to proceed.
> +invalid certificates.
> 
>  If you encounter ``certificate verification'' errors or ones saying
>  that ``common name doesn't match requested host name'', you can use
> diff --git a/src/gnutls.c b/src/gnutls.c
> index d1444fe..b48e4e8 100644
> --- a/src/gnutls.c
> +++ b/src/gnutls.c
> @@ -686,12 +686,13 @@ ssl_check_certificate (int fd, const char *host)
> 
>unsigned int status;
>int err;
> -
> -  /* If the user has specified --no-check-cert, we still want to warn
> - him about problems with the server's certificate.  */
> -  const char *severity = opt.check_cert ? _("ERROR") : _("WARNING");
> +  const char *severity = _("ERROR");
>bool success = true;
> 
> +  /* The user explicitly said to not check for the certificate.  */
> +  if (!opt.check_cert)
> +return success;
> +
>err = gnutls_certificate_verify_peers2 (ctx->session, &status);
>if (err < 0)
>  {
> @@ -766,5 +767,5 @@ ssl_check_certificate (int fd, const char *host)
>  }
> 
>   out:
> -  return opt.check_cert ? success : true;
> +  return success;
>  }
> diff --git a/src/openssl.c b/src/openssl.c
> index 4876048..f5fe675 100644
> --- a/src/openssl.c
> +++ b/src/openssl.c
> @@ -673,15 +673,15 @@ ssl_check_certificate (int fd, const char *host)
>long vresult;
>bool success = true;
>bool alt_name_checked = false;
> -
> -  /* If the user has specified --no-check-cert, we still want to warn
> - him about problems with the server's certificate.  */
> -  const char *severity = opt.check_cert ? _("ERROR") : _("WARNING");
> -
> +  const char *severity = _("ERROR");
>struct openssl_transport_context *ctx = fd_transport_context (fd);
>SSL *conn = ctx->conn;
>assert (conn != NULL);
> 
> +  /* The user explicitly said to not check for the certificate.  */
> +  if (!opt.check_cert)
> +return success;
> +
>cert = SSL_get_peer_certificate (conn);
>if (!cert)
>  {
> @@ -885,8 +885,7 @@ ssl_check_certificate (int fd, const char *host)
>  To connect to %s insecurely, use 

Re: [Bug-wget] wget 1.17 segfaults under openSUSE 13.1 (x86_64; glibc 2.18)

2015-11-21 Thread Tim Rühsen
Hi,

here is a patch to avoid the crash, please review.

@Darshit As you say, we probably should discuss the program logic in
http_loop() regarding if-modified-since !?

Regards, Tim


On Saturday, 21 November 2015 at 00:10:44, Darshit Shah wrote:
> Another thing that I just remembered, this issue seems to pop up when the
> file being downloaded already exists on disk. Maybe, that is why you're
> seeing the different behaviour?
>
> Try downloading the file when it already exists and see if the problem can
> be reproduced on the newer system.
>
> On 11/20, Darshit Shah wrote:
> >This looks similar to another segfault I've seen. I'm not sure since
> >when it exists in the code, but I did come across one recently. That
> >case was caused when --trust-server-names -N and --content-disposition
> >were all provided to Wget. A wrong logic condition causes Wget to work
> >on output_filename = NULL which eventually results in a segfault.
> >
> >I'm assuming the case you've provided is very similar to what I've
> >seen. I was using Clang when I saw this issue. Unfortunately, I'm
> >currently swamped with other work and am unable to look more deeply
> >into this.
> >
> >The stack trace provided does match my explanation above, so we should
> >be able to track it down and fix it. Some interplay of multiple
> >options is causing this bug, though I can't explain why it works with
> >one GCC version and not with another. The way I see it, it was a logic
> >bug in the code.
> >
> >On 11/20, Schleusener, Jens wrote:
> >>Hi,
> >>
> >>under some conditions I get with a self-compiled wget 1.17 binary on
> >>a 64-bit openSUSE 13.1 Linux system a segmentation fault (but the
> >>self-compiled wget 1.16.3 works correctly).
> >>
> >>I could reduce the problem to this usage case:
> >>
> >>wget -N --content-disposition http://ftp.gnu.org/gnu/wget/
> >>
> >>while the usage of only "-N" or "--content-disposition" lets wget work.
> >>
> >>Sorry, I am not a C expert but nevertheless I tried to use gdb on
> >>the resulting core dump as best I could with the following result:
> >>
> >>Program terminated with signal SIGSEGV, Segmentation fault.
> >>#0  0x7f5b0899d42a in strlen () from /lib64/libc.so.6
> >>(gdb) bt
> >>#0  0x7f5b0899d42a in strlen () from /lib64/libc.so.6
> >>#1  0x00420cdf in set_file_timestamp ()
> >>#2  0x004241e0 in http_loop ()
> >>#3  0x00433619 in retrieve_url ()
> >>#4  0x0042c64b in main ()
> >>
> >>The used gcc version is
> >>4.8.1
> >>and a "rpm -qf /lib64/libc.so.6" issues
> >>glibc-2.18-4.38.1.x86_64
> >>
> >>On a newer openSUSE Leap 42.1 Linux system the "identically"
> >>compiled wget doesn't segfault and works ok. On that system the gcc
> >>version is
> >>4.8.5
> >>and a "rpm -qf /lib64/libc.so.6" issues
> >>glibc-2.19-17.4.x86_64
> >>
> >>Regards
> >>
> >>Jens
From 0eacdbfc1b3341fce2fcb8239dcd81d6dd3969f9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tim Rühsen?= <tim.rueh...@gmx.de>
Date: Sat, 21 Nov 2015 21:44:11 +0100
Subject: [PATCH] Fix SIGSEGV in -N / --content-disposition combination

* src/http.c (http_loop): Fix SIGSEGV

Reported-by: "Schleusener, Jens" <jens.schleuse...@t-online.de>
---
 src/http.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/src/http.c b/src/http.c
index 355ff53..9d71483 100644
--- a/src/http.c
+++ b/src/http.c
@@ -3794,7 +3794,6 @@ http_loop (struct url *u, struct url *original_url, char **newloc,
   struct http_stat hstat;/* HTTP status */
   struct_stat st;
   bool send_head_first = true;
-  char *file_name;
   bool force_full_retrieve = false;


@@ -3864,11 +3863,6 @@ http_loop (struct url *u, struct url *original_url, char **newloc,
   if (opt.content_disposition && opt.always_rest)
 send_head_first = true;

-  if (!opt.output_document)
-  file_name = url_file_name (opt.trustservernames ? u : original_url, NULL);
-  else
-file_name = xstrdup (opt.output_document);
-
 #ifdef HAVE_METALINK
   if (opt.metalink_over_http)
 {
@@ -3881,7 +3875,7 @@ http_loop (struct url *u, struct url *original_url, char **newloc,
 {
   /* Use conditional get request if requested
* and if timestamp is known at this moment.  */
-  if (opt.if_modified_since && file_exists_p (file_name) && !send_head_first)
+  if (opt.if_modified_since && !send_head_first && got_name && file_exi

Re: [Bug-wget] Wget 1.17 doesn't compile on Windows (hsts.c)

2015-11-17 Thread Tim Rühsen
Hi Dagobert,


On Tuesday, 17 November 2015 at 15:09:05, Dagobert Michelsen wrote:
> > RUNNING TEST test_hsts_url_rewrite_superdomain...
> > A new entry should've been created
> > Tests run: 15
> > FAIL unit-tests (exit status: 1)
> 
> So which test is failing? test_hsts_url_rewrite_superdomain ?
> There is no *hsts* in tests/
> 

Could you please change L689 of src/hsts.c from
  created = hsts_store_entry (s, SCHEME_HTTPS, "www.foo.com", 443, time(NULL) 
+ 1234, true);

to
  created = hsts_store_entry (s, SCHEME_HTTPS, "www.foo.com", 443, 1234, 
true);

and give it a try ?

Regards, Tim



Re: [Bug-wget] Travis-CI updates and minor bug fixes

2015-10-11 Thread Tim Rühsen
Hi Darshit,

nice to see your Travis-CI patches !

I am not sure, but maybe an older version of valgrind hits you:
http://stackoverflow.com/questions/12708501/valgrind-breaks-with-dirname
(They say memrchr should work with valgrind 3.8+... but maybe just from 3.8.1 
on ?)
I guess a suppression to silence valgrind makes sense.
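Such a suppression could look roughly like this (hypothetical frame names -
take the exact ones from valgrind's --gen-suppressions=all output):

{
   memrchr-dirname-false-positive
   Memcheck:Cond
   fun:memrchr
   fun:dirname
}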


You have
+before_script:
+- export CFLAGS="-O2 -Wall -Wextra"

but the script sets CFLAGS on its own... I am not sure how these interfere 
and/or influence 'make check|distcheck'.

Your patches look fine to me; looking forward to the push.

Regards, Tim


On Saturday, 10 October 2015 at 21:50:07, Darshit Shah wrote:
> I've recently been working on setting up an automated build and test
> of the Wget source on Travis-CI. This threw a couple of issues that
> I've since debugged.
> 
> As of now, there is an automated build and test of Wget on every
> commit pushed to the "travis" branch on my personal repository of Wget
> on Github[1]. I have attached two patches here which came to light due
> to the Travis tests.
> 1. The valgrind suppressions file for SSL tests was not included in
> the MAKE_DIST variable causing distcheck to fail.
> 2. Fix wrong logic in Test-ftp-pasv-no-support.px.
> 
> Also, I've attached a patch for including the .travis.yml file and a
> script for compiling and running tests on the Travis container. If
> someone here has experience setting up tests on Travis, kindly to
> review the patch to see if something can be improved.
> 
> Currently, the tests on Travis fail since the valgrind tests are
> failing. This is interesting because on my Arch Linux machine with
> valgrind 3.11, all the tests pass. But on the Ubuntu Precise container
> with Valgrind 3.8, 6 of the Perl tests fail due to "An unconditional
> jump based on uninitialized values". I can reproduce these on a
> virtual machine running Ubuntu Precise, but am unable to do so on my
> local Arch Linux machine. The log file for the latest run as of this
> email is:
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/84697418/log.txt
> 
> If this looks good, I'd like to set up Travis to run against each new
> commit pushed to Wget's repository on Savannah and add a couple more
> test cases.
> 
> 
> [1]: https://github.com/darnir/wget




Re: [Bug-wget] Travis-CI updates and minor bug fixes

2015-10-11 Thread Tim Rühsen
On Sunday, 11 October 2015 at 12:32:14, Darshit Shah wrote:
> However, I did come across a small problem. Travis currently accepts
> open source build requests from GitHub only. So I don't think we can
> use it from Savannah. Would it be a problem if we pushed the
> .travis.yml to master and allow the builds to be fired from one of our
> GitHub clones?

Hi Darshit,

IMO there is no problem.

If there is no possibility to trigger a build from Savannah, we have to use 
Github. Or we have to trigger it manually...

That is something that other people already asked for:
http://kamranicus.com/blog/2015/03/29/triggering-a-travis-build-programmatically/

Maybe you could give this a try (maybe using wget instead of curl ?).
http://docs.travis-ci.com/user/triggering-builds/
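Untested sketch, assuming the v3 API described on that page (replace the
token, and note the repo slug uses %2F as an escaped '/'):

wget --quiet --output-document=- \
     --header='Content-Type: application/json' \
     --header='Travis-API-Version: 3' \
     --header='Authorization: token XXXXXX' \
     --post-data='{"request":{"branch":"master"}}' \
     https://api.travis-ci.org/repo/darnir%2Fwget/requests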

Regards, Tim




Re: [Bug-wget] possible fix to a "make check" failure...

2015-10-10 Thread Tim Rühsen
On Friday, 9 October 2015 at 04:00:35, christian fafard wrote:
> 2 tests fail when the "IO::Socket::SSL" perl module is not installed:
> Test-proxied-https-auth-keepalive.px and Test-proxied-https-auth.px.
> In my opinion, it falsely suggests a problem with wget's build. I propose
> instead to test first if the module is installed, and skip these tests if
> not. I attach a patch to do so.
> Thanks, Christian Fafard

Hi Christian,

thanks for your contribution, IMO it is a good idea.

Could you amend your patch to print out an appropriate message, so that 
looking at the log file tells me what to do !?

And if possible without too much trouble for you, could you send us the patch
generated with git format-patch? (Never mind if you don't work with git.)

Regards, Tim




Re: [Bug-wget] [bug #46161] make check fails due to missing library in tests/Makefile link command

2015-10-10 Thread Tim Rühsen
Hi Kevin,

I can't reproduce the wget compilation problem.

Steps I did:
- download http://ftp.gnu.org/gnu/wget/wget-1.16.3.tar.gz
- tar xvfz wget-1.16.3.tar.gz
- cd wget-1.16.3/
- ./configure
- make
- make check

You should really give us all details on how to reproduce your problem with 
tests/Makefile.in. Also we need more information about your OS and the 
versions of software you use (see README.checkout for what is relevant to wget 
compilation).

We won't (and we simply can't) support all combinations of library/header
versions of openssl. In fact, you should really keep library and header
versions in sync, else you may run into obscure problems with self-compiled
software later (a perfect recipe for an unstable system).

Regards, Tim

On Friday, 9 October 2015 at 14:30:26, Rodgers, Kevin wrote:
> Hi Tim,
> 
> Yes, I realized after I submitted my report that tests/Makefile is
> generated.  But Makefile.in is included in the source distribution, so that
> is a better file to patch (1 step upstream of tests/Makefile, but still 1
> step downstream of tests/Makefile.am).
 
> I did download and build wget 1.16.3 and found that it has the same problem.
>  In addition, I had to hack openssl.c to deal with the fact that the old
> server I'm working on has libssl.so.0.9.8 installed but its include files
> are from 0.9.7a (OpenSSL):
 
> *** src/openssl.c.orig  2015-02-10 14:23:49.0 -0700
> --- src/openssl.c   2015-10-08 15:41:23.0 -0600
> ***
> *** 198,203 
> --- 198,206 
>   #if OPENSSL_VERSION_NUMBER >= 0x00907000
> OPENSSL_load_builtin_modules();
> ENGINE_load_builtin_engines();
> +   #ifndef CONF_MFLAGS_DEFAULT_SECTION
> +   #define CONF_MFLAGS_DEFAULT_SECTION 0x20
> +   #endif
> CONF_modules_load_file(NULL, NULL,
> CONF_MFLAGS_DEFAULT_SECTION|CONF_MFLAGS_IGNORE_MISSING_FILE);
>   #endif
> 
> Thanks,
> 
> Kevin Rodgers 
> Principal Software Engineer
> Product Development & Design
> Tel: 303-397-2807
> IHS: 710-2807
> 
> 
> 
> -Original Message-
> From: Tim Ruehsen [mailto:invalid.nore...@gnu.org] 
> Sent: Friday, October 09, 2015 2:10 AM
> To: Tim Ruehsen; Rodgers, Kevin; gscriv...@gnu.org; dar...@gmail.com; bug-wget@gnu.org
> Subject: [bug #46161] make check fails due to missing library in tests/Makefile link command
> Update of bug #46161 (project wget):
> 
>   Status:          None => Invalid
>   Assigned to:     None => rockdaboot
>   Fixed Release:   None => 1.16.3
> ___
> 
> Follow-up Comment #1:
> 
> Thanks for having a look.
> 
> But tests/Makefile is an auto-generated file (generated from Makefile.am).
> So it is definitely the wrong place to fix anything.
 
> And please always check the latest source code from git, the problem might
> have already been fixed.
 
> How to get the sources from git:
> https://savannah.gnu.org/git/?group=wget
> 
> After downloading the sources you should read README.checkout to proceed.
> If you still have problems to compile/link, feel free to open a new issue.
> 
> 
> ___
> 
> Reply to this item at:
> 
>   
> 
> ___
>   Message sent via/by Savannah
>   http://savannah.gnu.org/
> 




Re: [Bug-wget] Multi segment download

2015-08-29 Thread Tim Rühsen
Hi,

Normally multi-segment downloading makes much more sense when you have
several download mirrors and checksums for each chunk. The perfect technique
for this is called 'Metalink' (more on www.metalinker.org).
Wget has it in the 'master' branch - a GSoC project of Hubert Tarasiuk.

Additionally, Wget2 is under development, already having the option --chunk-
size (e.g. --chunk-size=1M) to start a multi-threaded download of a file.

Regards, Tim


On Friday, 28 August 2015 at 15:41:27, Random Coder wrote:
 On Fri, Aug 28, 2015 at 3:06 PM, Ander Juaristi ajuari...@gmx.es wrote:
  Hi,
  
  Would you point us to some potential use cases? How would a Wget user
  benefit from such a feature? One of the best regarded feature of download
  managers is the ability to resume paused downloads, and that's already
  supported by Wget. Apart from that, I can't come across any other use
  case. But that's me, maybe you have a broader overview.
 One possible feature, described in flowery language from a product
 description: ... splits files into several sections and downloads
 them simultaneously, allowing you to use any type of connection at the
 maximum available speed. With FDM download speed increases, or even
 more!
 
 And, just to show this can help, at least in some situations, here's an
 example using curl (sorry, I don't know how to do a similar request in
 wget).  First a normal download of the file:
 
 curl -o all http://mirror.internode.on.net/pub/test/100meg.test
 
 This command takes an average of 48.9 seconds to run on my current
 network connection.  Now, if I split up the download as the download
 manager will, and run these four commands at the same instant:
 
 curl -o part1 -r0-2500 http://mirror.internode.on.net/pub/test/100meg.test
 curl -o part2 -r2501-5000 http://mirror.internode.on.net/pub/test/100meg.test
 curl -o part3 -r5001-7500 http://mirror.internode.on.net/pub/test/100meg.test
 curl -o part4 -r7501- http://mirror.internode.on.net/pub/test/100meg.test
 
 The union of time it takes all four commands to run ends up being an
 average of 19.9 seconds over a few test runs on the same connection.
 There's some penalty here because I need to spend time combining the
 files afterwards, but if the command supported this logic internally,
 no doubt much of that work could be done up front as the file is
 downloaded.




Re: [Bug-wget] Unit test case for parse_content_range()

2015-08-29 Thread Tim Rühsen
On Saturday, 29 August 2015 at 23:13:23, Darshit Shah wrote:
 I've written a unit test for the parse_content_range() method.
 However, I haven't yet populated it with various test cases.
 Sharing the patch for the unit test here. I will add more test cases
 for this test later.
 
 Kindly do review the patch. If no one complains, I'll push it in a
 couple of days.

Hi Darshit,

some of the 'valid' tests

0-max
{ "bytes 0-1000/1000", 0, 1000, 1000}
non0-max
{ "bytes 1-1000/1000", 1, 1000, 1000}
0-valid
{ "bytes 0-500/1000", 0, 500, 1000}
non0-valid
{ "bytes 1-500/1000", 1, 500, 1000}
0-(max-1)
{ "bytes 0-999/1000", 0, 999, 1000}
non0-(max-1)
{ "bytes 1-999/1000", 1, 999, 1000}

And please add some tests using >= 2^31 and >= 2^32 as values.
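For example (hypothetical entries in the same style, around the 32-bit
boundaries):

{ "bytes 0-2147483647/2147483648", 0, 2147483647, 2147483648}
{ "bytes 0-4294967295/4294967296", 0, 4294967295, 4294967296}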

Regards, Tim




Re: [Bug-wget] Lint

2015-08-22 Thread Tim Rühsen
On Saturday, 22 August 2015 at 02:21:10, grarpamp wrote:
 NEWS:122:** Accept the arguments --accept-reject and --reject-regex.
 
 s/reject /regex /

Thanks, fixed.

Tim




Re: [Bug-wget] [PATCH] Clarify that links are being converted.

2015-08-21 Thread Tim Rühsen
Pushed, thanks.

Tim

On Saturday, 22 August 2015 at 02:29:05, Jookia wrote:
 * src/convert.c: Add 'links in' after 'Converted %d' and 'Converting %s'.
 ---
  src/convert.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/src/convert.c b/src/convert.c
 index 6d78945..f0df9a0 100644
 --- a/src/convert.c
 +++ b/src/convert.c
 @@ -193,7 +193,7 @@ convert_all_links (void)
convert_links_in_hashtable (downloaded_css_set, 1, file_count);
 
secs = ptimer_measure (timer);
 -  logprintf (LOG_VERBOSE, _("Converted %d files in %s seconds.\n"),
 +  logprintf (LOG_VERBOSE, _("Converted links in %d files in %s seconds.\n"),
  file_count, print_decimal (secs));
 
ptimer_destroy (timer);
 @@ -221,7 +221,7 @@ convert_links (const char *file, struct urlpos *links)
struct urlpos *link;
int to_url_count = 0, to_file_count = 0;
 
 -  logprintf (LOG_VERBOSE, _("Converting %s... "), file);
 +  logprintf (LOG_VERBOSE, _("Converting links in %s... "), file);
 
{
  /* First we do a dry run: go through the list L and see whether




Re: [Bug-wget] bad filenames (again)

2015-08-21 Thread Tim Rühsen
On Friday, 21 August 2015 at 17:28:09, Andries E. Brouwer wrote:
 On Fri, Aug 21, 2015 at 04:34:36PM +0200, Tim Ruehsen wrote:
  On Friday 21 August 2015 14:22:22 Andries E. Brouwer wrote:
   Let me find some random site. Say
   http://web2go.board19.com/gopro/go_view.php?id=12345
  
  The server tells us the document is UTF-8.
  The document tells us it is 'UTF-8'.
 
 And it is not. So - this example establishes that remote character set
 information, when present, is often unreliable.
 
 Let me add one more example,
 
 http://www.win.tue.nl/~aeb/linux/lk/r%f8dgr%f8d.html
 
 a famous Danish recipe. The headers say Content-Type: text/html
 without revealing any character set.

1. There is no URL to parse in this document, so encoding does not matter 
anyway.

2. If the server AND the document do not explicitly specify the character
encoding, there still is one - namely the default. It used to be ISO-8859-1.
AFAIR, HTML5 might have changed that (too late for me now to look it up).

There is a good diagram - maybe not perfectly up to date, but it still shows
roughly how to operate:
http://nikitathespider.com/articles/EncodingDivination.html
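For reference, the document-level declaration that such divination looks for
is the meta tag (HTML5 and HTML4 forms):

<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">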

 
   Moreover, the character set of a filename is in general unrelated
   to the character set of the contents of the file. That is most clear
   when the file is not a text file. What character set is the filename
   
   http://www.win.tue.nl/~aeb/linux/lk/kn%e4ckebr%f6d.jpg
  
  Wrong question. It is a JPEG file. Content doesn't matter to wget.
 
 Hmm. I thought the topic of our discussion was filenames and character sets.
 Here is a file, and its name is in ISO 8859-1.
 When wget saves it, what will the filename be?
 
  If you want to download the above mentioned web page and
  you have a UTF-8 locale, you have to tell wget via --local-encoding what
  encoding the URL is.
 
 Are you sure you do not mean --remote-encoding?

Yes, I am sure. Here are my tests (my locale is UTF-8):

Wrong:
$ wget -nv --remote-encoding=iso-8859-1 
http://www.win.tue.nl/~aeb/linux/lk/kn%e4ckebr%f6d.jpg
2015-08-21 20:09:30 URL:http://www.win.tue.nl/~aeb/linux/lk/kn%e4ckebr%f6d.jpg 
[11690/11690] -> "kn�ckebr�d.jpg.1" [1]

Right:
http://www.win.tue.nl/~aeb/linux/lk/kn%C3%A4ckebr%C3%B6d.jpg:
2015-08-21 20:14:18 FEHLER 404: Not Found.
2015-08-21 20:14:18 URL:http://www.win.tue.nl/~aeb/linux/lk/kn%e4ckebr%f6d.jpg 
[11690/11690] -> "knäckebröd.jpg" [1]


 But whatever you mean, it is an additional option.
 If the wget user already knows the character set, she can of course tell
 wget.
 
 The discussion is about the situation where the user does not know.
 
 So, that is the situation we are discussing: a remote site, the user
 does not know what encoding is used (she will find out after downloading),
 and the headers have either no information or wrong information.
 Now if one invokes iconv it is likely that garbage will be the result.


 Here a Korean example.
 http://cfile204.uf.daum.net/attach/1847B5314CF754B83134B7
 The http headers say Content-Type: text/plain; charset=iso-8859-1
 (which is incorrect), an internal header says that this is ISO-2022-KR
 (which is also incorrect), in fact the content is in EUC-KR.
 That is none of wget's business, we want to save this file.
 The headers say
 Content-Disposition: attachment;
 filename=20101202_%EB%86%8D%EC%8B%AC%EC%8B%A0%EB%9D%BC%EB%A9%B4%EB%B0%B0_%
 EB%B0%94%EB%91%91(%EB%8B%A4%EC%B9%B4%EC%98%A4%EC%8B%A0%EC%A7%809%EB%8B%A8-%E
 B%B0%B1_.sgf This encodes a valid utf-8 filename, and that name should be
 used. So wget should save this file under the name
 20101202_농심신라면배_바둑(다카오신지9단-백_.sgf

This is a different issue. Here we are talking about the encoding of HTTP
headers, especially 'filename' values within the Content-Disposition HTTP
header. The above is correctly encoded (UTF-8 percent encoding).

The encoding is described in RFC5987 (Character Set and Language Encoding for
 Hypertext Transfer Protocol (HTTP) Header Field Parameters).
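For reference, the RFC 5987 form of such a parameter carries the charset
explicitly (illustrative value, not the exact header this server sent):

Content-Disposition: attachment; filename*=UTF-8''kn%C3%A4ckebr%C3%B6d.jpg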

Wget simply does not parse this correctly - it is just not coded in.
That is why support for Content-Disposition in Wget is documented as 
'experimental' (you have to explicitly enable it via --content-disposition).

Again, the server encoding is known. Regarding filename encoding, nothing is
wrong in your example. It is just that Wget is missing some code here (worth
opening a separate bug).


Default Wget behavior:
$ wget -nv http://cfile204.uf.daum.net/attach/1847B5314CF754B83134B7
2015-08-21 20:20:05 
URL:http://cfile204.uf.daum.net/attach/1847B5314CF754B83134B7 [1441/1441] ->
"1847B5314CF754B83134B7" [1]


Enabled Content-Disposition support:
$ wget -nv --content-disposition 
http://cfile204.uf.daum.net/attach/1847B5314CF754B83134B7
2015-08-21 20:23:50 
URL:http://cfile204.uf.daum.net/attach/1847B5314CF754B83134B7 [1441/1441] ->
"20101202_%EB%86%8D%EC%8B%AC%EC%8B%A0%EB%9D%BC%EB%A9%B4%EB%B0%B0_%EB%B0%94%EB%91%91(%EB%8B%A4%EC%B9%B4%EC%98%A4%EC%8B%A0%EC%A7%809%EB%8B%A8-%EB%B0%B1_.sgf" [1]

As we see, unescaping and UTF-8 to locale 

Re: [Bug-wget] [PATCH] Removed useless TODOs.

2015-08-21 Thread Tim Rühsen
Pushed, thanks.

Tim

On Saturday, 22 August 2015 at 01:41:44, Jookia wrote:
  * testenv/Test--rejected-log.py: Remove TODOs.
 ---
  testenv/Test--rejected-log.py | 2 --
  1 file changed, 2 deletions(-)
 
 diff --git a/testenv/Test--rejected-log.py b/testenv/Test--rejected-log.py
 index ef72794..7c6c4c4 100755
 --- a/testenv/Test--rejected-log.py
 +++ b/testenv/Test--rejected-log.py
 @@ -78,8 +78,6 @@ Files = [[index_html, secondpage_html, thirdpage_html,
 robots_txt, dummy_txt]]
 
  ExpectedReturnCode = 0
  ExpectedDownloadedFiles = [index_html, secondpage_html, thirdpage_html,
 robots_txt, log_csv]
 -# TODO: fix long line
 -# TODO: check names
 
   Pre and Post Test Hooks
 # pre_test = {





Re: [Bug-wget] afl-fuzz'ing wget?

2015-08-15 Thread Tim Rühsen
On Saturday, 15 August 2015 at 23:08:03, Jacek Wielemborek wrote:
 On 15.08.2015 at 22:23, Tim Rühsen wrote:
  On Saturday, 15 August 2015 at 12:29:45, Jacek Wielemborek wrote:
  Hello,
  
  I was looking into fuzzing wget with afl-fuzz [1]. While I hadn't
  managed to crash it yet, I found a lot of code paths so far with the
  
  following input:
  HTTP/1.1 200 OK
  Server: nginx
  Date: Mon, 10 Aug 2015 20:31:38 GMT
  Content-Type: text/html; charset=utf-8
  Content-Length: 283087
  Connection: keep-alive
  Vary: Accept-Encoding
  cache-control: no-cache
  
  
  qwe
  
  Hi Jacek,
  
  what exactly did you find ?
  
  Maybe you can give us an example wget command line that produces
  unexpected
  behavior. Or better, give us a pointer to the code that fails.
   We highly appreciate patches to wget (non-trivial patches need an FSF
   copyright assignment from you).
  
  Looking forward to hear from you.
  
  Tim
 
 Hello,
 
 I found nothing because I was only testing it on a netbook so far,
 but I wanted to know if it was tested before and, if not, encourage you
 people to do that by giving some pointers on how this can be achieved.
 I'll let you know once I find anything.

I am not sure how afl-fuzz handles bidirectional communication, or how the
input files have to look. Try to simulate/test an FTP connection - this is
a sequence of input and output. If you get this working, --mirror resp. -r
should be straightforward. There are examples (including HTML documents) in
the tests/ and testenv/ directories.

Tim




Re: [Bug-wget] recur.c compile error

2015-08-15 Thread Tim Rühsen
On Saturday, 15 August 2015 at 12:15:46, Darshit Shah wrote:
 On Sat, Aug 15, 2015 at 1:33 AM, Gisle Vanem gva...@yahoo.no wrote:
  The new reject stuff in recur.c:
typedef enum
{

  SUCCESS, BLACKLIST, NOTHTTPS, NONHTTP, ABSOLUTE, DOMAIN, PARENT, LIST, REGEX,
  RULES, SPANNEDHOST, ROBOTS

} reject_reason;
  
  causes errors with MSVC and MingW since in:
math.h:952:#define DOMAIN  _DOMAIN i.e. 1
  wingdi.h:1893: #define ABSOLUTE  1
  
  math.h is pulled in via some Gnulib headers. And wingdi.h via
  windows.h. I suggest this simple fix:
  
  --- a/src/recur.c   2015-08-14 21:45:44
  +++ b/src/recur.c   2015-08-14 21:54:45
  @@ -182,6 +182,9 @@
  
 return ret;
   
   }
  
  +#undef ABSOLUTE
  +#undef DOMAIN
  +
  
   typedef enum
   {
   
 SUCCESS, BLACKLIST, NOTHTTPS, NONHTTP, ABSOLUTE, DOMAIN, PARENT, LIST,
  
  REGEX,
 
 I think this is an ugly solution.
 
  Or better names for the enumerations; 'RR_xx' ?
 
 This is much better. We should rename our constants to something like WG_*.

ACK.

Maybe WG_RR_* ?
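That would look like this (a sketch of the proposed rename applied to the
enum quoted above; the prefixed names no longer collide with the ABSOLUTE and
DOMAIN macros from wingdi.h and math.h):

typedef enum
{
  WG_RR_SUCCESS, WG_RR_BLACKLIST, WG_RR_NOTHTTPS, WG_RR_NONHTTP,
  WG_RR_ABSOLUTE, WG_RR_DOMAIN, WG_RR_PARENT, WG_RR_LIST, WG_RR_REGEX,
  WG_RR_RULES, WG_RR_SPANNEDHOST, WG_RR_ROBOTS
} reject_reason;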





Re: [Bug-wget] afl-fuzz'ing wget?

2015-08-15 Thread Tim Rühsen
On Saturday, 15 August 2015 at 12:29:45, Jacek Wielemborek wrote:
 Hello,
 
 I was looking into fuzzing wget with afl-fuzz [1]. While I hadn't
 managed to crash it yet, I found a lot of code paths so far with the
 
 following input:
  HTTP/1.1 200 OK
  Server: nginx
  Date: Mon, 10 Aug 2015 20:31:38 GMT
  Content-Type: text/html; charset=utf-8
  Content-Length: 283087
  Connection: keep-alive
  Vary: Accept-Encoding
  cache-control: no-cache
  
  
  qwe

Hi Jacek,

what exactly did you find ?

Maybe you can give us an example wget command line that produces unexpected 
behavior. Or better, give us a pointer to the code that fails.
We highly appreciate patches to wget (non-trivial patches need an FSF
copyright assignment from you).

Looking forward to hear from you.

Tim




Re: [Bug-wget] bad filenames (again)

2015-08-13 Thread Tim Rühsen
On Thursday, 13 August 2015 at 19:33:56, Andries E. Brouwer wrote:
 After git clone, one gets a wget tree without autogenerated files.
 README.checkout tells one to run ./bootstrap to generate configure.
 
 But:
 
 $ ./bootstrap
 ./bootstrap: Bootstrapping from checked-out wget sources...
 ./bootstrap: consider installing git-merge-changelog from gnulib
 ./bootstrap: getting gnulib files...
 ...
 
 running: AUTOPOINT=true LIBTOOLIZE=true autoreconf --verbose --install
 --force -I m4  --no-recursive autoreconf: Entering directory `.'
 autoreconf: running: true --force
 autoreconf: running: aclocal -I m4 --force -I m4
 configure.ac:498: warning: macro 'AM_PATH_GPGME' not found in library
 autoreconf: configure.ac: tracing
 autoreconf: configure.ac: not using Libtool
 autoreconf: running: /usr/bin/autoconf --include=m4 --force
 configure.ac:93: error: possibly undefined macro: AC_DEFINE
   If this token and others are legitimate, please use m4_pattern_allow.
   See the Autoconf documentation.
 configure.ac:498: error: possibly undefined macro: AM_PATH_GPGME
 autoreconf: /usr/bin/autoconf failed with exit status: 1
 ./bootstrap: autoreconf failed

Yes, sorry, that is a recent issue with metalink. Darshit is working on it.

You have to install libgpgme11-dev (or a similarly named package).

Tim




Re: [Bug-wget] Metalink support

2015-07-04 Thread Tim Rühsen
On Friday, 3 July 2015 at 18:15:27, Anthony Bryan wrote:
  $ wget
  --input-metalink=http://www.metalinker.org/samples/dsl-3.3.iso.metalink
 
 I'm not sure what the typical wget user expects, but here are some examples.
 
 if you keep the current behavior, it might be helpful to print to the
 user that they've downloaded a metalink, and how to make wget use it.
 
 aria2 (another command line metalink downloader) uses these commands
 to download the metalink and process it (download the file described
 by it & do a hash check) by default. You have to specify
 '--follow-metalink=false' to only download the metalink XML file.
 
 curl on the other hand just downloads the metalink XML file by default
 (like any other file), and requires a '--metalink' option to process a
 local file or URL. (examples & help/man pages follow)

The above example is not advertised as a metalink file :-(
Wget's behaviour is absolutely perfect in this case:

HTTP/1.1 200 OK
Server: Apache/2.4.12
Last-Modified: Thu, 28 Feb 2013 23:50:21 GMT
ETag: "25d4-4d6d18dc2d087"
Content-Type: application/x-iso9660-image
Accept-Ranges: bytes
Date: Sat, 04 Jul 2015 13:47:34 GMT
Connection: keep-alive
Via: 1.1 varnish
Age: 0
Content-Length: 9684


Only if it were served with a
Content-Type: application/metalink+xml
or
Content-Type: application/metalink4+xml

header should the target be automatically downloaded... IMO.

Anthony, why does the server advertise the file as x-iso9660-image?
Is it a bug or a feature?

Regards, Tim




Re: [Bug-wget] Metalink support

2015-07-03 Thread Tim Rühsen
On Friday, 3 July 2015 at 20:23:00, Hubert Tarasiuk wrote:
 On 03.07.2015 at 17:12, Tim Ruehsen wrote:
  To copy GPGME m4 macro, I have to install libgpgme11-dev. And if I do
  that, a ./bootstrap (maybe just a configure) does the job for me.
 
 Actually I meant copying over the m4 GPGME macro and adding it to Wget's
 repository, so that it would be available straight after cloning (I have
 seen some projects including gpgme.m4 in their repositories to resolve
 the problem).
 Or maybe we could provide some kind of wrapper that would call the
 proper GPGME macro if it is defined elsewhere, and otherwise do nothing.
 (Not sure how to do that though.)

Let me take a closer look at that over the next few days.
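
For what it's worth, autoconf can express such a wrapper directly in 
configure.ac; a minimal sketch (untested on my side, the warning text is 
invented):

  m4_ifdef([AM_PATH_GPGME],
           [AM_PATH_GPGME],
           [AC_MSG_WARN([AM_PATH_GPGME not found - GPGME support disabled])])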


  I tried
  $ wget http://www.metalinker.org/samples/dsl-3.3.iso.metalink
  which I understand as a download link to dsl-3.3.iso.
  
  That only downloads the .metalink file and that's it !?
  I expected Wget to download dsl-3.3.iso (of course first the .metalink
  file as intermediate file, which is needed for mirror information).
  I am aware of --input-metalink, but would like to see a bit more of
  automation in the above example.
 
 I kind of agree with that. But we should probably not change the default
 behaviour of previous versions. How about this:

What are 'previous versions' ? If you are talking about the wget-parallel 
branch, it never became an official release. So there is no 'default' 
behaviour defined that we would have to preserve.

 $ wget
 --input-metalink=http://www.metalinker.org/samples/dsl-3.3.iso.metalink
 
 Would that work for you? (Allowing to provide an URL instead of local
 file to the input-metalink option.)

Of course it works for me, but we are not talking about me :-)

The main question is: what does the typical wget user expect ?
I would think (s)he expects the download of the file described by the 
.metalink file. Of course a few users really want to have only the .metalink 
file (for testing/inspection). So there should be a possibility to do exactly 
that (e.g. --keep-metalink).

Before making any decision/action we should wait for some other voices.

 
  WDYT ?
  
  Regards, Tim




Re: [Bug-wget] Behaviour of spanning to accepted domains

2015-06-07 Thread Tim Rühsen
Am Sonntag, 7. Juni 2015, 08:19:28 schrieb Tony Lewis:
 On Friday, June 05, 2015 1:24 PM, Tim Rühsen wrote:
   First, I have not dug into the source code to see how -H is implemented.
   However, it makes sense to me that one ought to be able to specify
   both -H and -D together.
  
  -H (=all domains)
  to exclude some sites use --exclude-domains domain-list
 
 wget --help says about -H: go to foreign hosts when recursive.
 
 It doesn't say that when using -H one *must* take every foreign host that
 exists on the Internet and I'm arguing that such an interpretation does not
 make sense.

That is what -H is for :-)
Well, not *every* foreign host, but *every* foreign host that appears in 
downloaded, parsable files (HTML and CSS files).

wget --help just gives a short help, not a full description. See 'man wget' 
for the extended description. If there is something unclear, we should fix it.

Using -H always has the chance to 'download the whole internet'. That's 
normally not what you want and thus -H is not enabled by default.

 
 One ought to be able to request that wget go to foreign hosts without that
 implying that wget mirror the entire Internet. One obvious way to limit
 which foreign hosts are mirrored is to use -H in combination with -D.
 
   Consider this scenario: I want to mirror a site including the images
   that are stored in a sub-domain, but I don't want to mirror every
   external site referenced by the site. So I would try this:
   
   wget --mirror http://www.somesite.com -H -D www.somesite.com
   images.somesite.com
  
  You can also play with:
-A acclist --accept acclist
-R rejlist --reject rejlist
 
 I can play with lots of wget options, but in the scenario described I want
 *all* files from two hosts, but not every other foreign host that might be
 referenced by one of those hosts.
 
 What command line would you use for the scenario described?

Let's say you want all from the two hosts example1.com and example2.com:

wget --mirror example1.com example2.com

Regards, Tim




Re: [Bug-wget] TCP Fast Open for HTTP

2015-05-28 Thread Tim Rühsen
Am Donnerstag, 28. Mai 2015, 18:51:52 schrieb Giuseppe Scrivano:
 Hubert Tarasiuk hubert.taras...@gmail.com writes:
  W dniu 28.05.2015 o 10:26, Giuseppe Scrivano pisze: Hubert Tarasiuk
  
  I wouldn't even have an option for --tcp-fast-open and avoid adding such
  low level details to the command line.  I would rather go for an env
  variable like TCP_FAST_OPEN=1, that wget might honor or not and not
  worry where it is not supported.
  
  What do you specifically mean by not worry where it is not supported?
 
 I was just thinking aloud.  We are not going to support it on all
 platforms and I would like that the wget configurations work more or
 less portably.  An alternative would be to just warn the user if it is
 not supported (or even silently skip it).
 
 After looking at the code, I think we should consider some performance
 numbers before we decide to include TFO in wget, to counter-balance
 the cost of the increased code paths we have to deal with.

Performance numbers:
https://reproducingnetworkresearch.wordpress.com/2014/06/03/cs244-14-tcp-fast-open-2/

In short: the faster the servers, the higher the bandwidth, and the longer the 
client-server distance, the more TFO matters. And servers are getting faster 
over time, while bandwidth keeps increasing as well.

BTW, an alternative might be QUIC (http://en.wikipedia.org/wiki/QUIC). QUIC 
approaches the same problem (RTT), but it seems far from being standardized, 
though there seems to be support in Apache and Nginx.

 I would leave out the configure option --enable-tcp-tfo. Instead just use the 
 constants defined in sys/socket.h, as Hubert already does.
 Why would you prefer to leave that option out of configure?

I said this because I couldn't imagine that someone would want to compile 
without TFO support when TFO is available. But I can't deny Giuseppe's above 
argument, so I am fine with a configure option. It really doesn't hurt.
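
To illustrate what 'just use the constants' means in practice, here is a 
minimal client-side sketch (my code, not Hubert's patch; the function name and 
the fallback path are invented):

  #include <sys/types.h>
  #include <sys/socket.h>

  /* Send the first request with TCP Fast Open where available: MSG_FASTOPEN
     replaces connect()+send() and carries the data already in the SYN.
     Without the constant, fall back to a normal connect()+send(). */
  static ssize_t
  tfo_send (int fd, const void *req, size_t len,
            const struct sockaddr *addr, socklen_t addrlen)
  {
  #ifdef MSG_FASTOPEN
    return sendto (fd, req, len, MSG_FASTOPEN, addr, addrlen);
  #else
    if (connect (fd, addr, addrlen) < 0)
      return -1;
    return send (fd, req, len, 0);
  #endif
  }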

Regards, Tim




Re: [Bug-wget] Conditional GET requests

2015-05-18 Thread Tim Rühsen
Hi Hubert,

your patches look very good now.

I tested them and had a quick look at the changes.

I could find only these very minor points:

In file included from http.c:32:0:
wget.h:335:32: warning: comma at end of enumerator list [-Wpedantic]
   IF_MODIF_SINCE   = 0x0080,/* use if-modified-since header */

Is there any reason to abbreviate MODIFIED to MODIF ? If not, 
IF_MODIFIED_SINCE is slightly more readable, at least to me.
Same goes for the opt member variable.

Regards, Tim

Am Montag, 18. Mai 2015, 13:38:33 schrieb Hubert Tarasiuk:
 Sorry, I found and fixed another spelling error.
 
 W dniu 18.05.2015 o 13:11, Hubert Tarasiuk pisze:
  I have reworked my patches. Specifically:
  1) --if-modified-since option is enabled by default and has only effect
  in timestamping mode. And yes, --no-if-modified-since is added
  automatically.
  2) I added all legal date formats to my test.
  3) I added another case to my test (local file is strictly newer than
  remote file).
  3) If time_to_rfc1123 fails, there is simple fall back behavior.
  4) I added work around behavior for servers ignoring If-Modified-Since
  (like for example our Perl test server).
  
  Patches are attached here as well as on Github for easy viewing.
  https://github.com/jy987321/Wget/commits/master-hubert
  
  Thank you,
  Hubert
  
  W dniu 14.05.2015 o 22:35, Hubert Tarasiuk pisze:
  W dniu 14.05.2015 o 21:12, Tim Rühsen pisze:
  Am Donnerstag, 14. Mai 2015, 15:43:54 schrieb Hubert Tarasiuk:
  W dniu 13.05.2015 o 13:28, Ander Juaristi pisze:
  And second, I'm not really sure whether --condget is the best name for
  the switch.
  Requests that include any of If-Unmodified-Since, If-Match,
  If-None-Match, or If-Range
  header fields are also conditional GETs as well.
  We might want to implement one of those in the future and we'd be
  forced
  to choose a name which could easily be
  inconsistent/confusing with --condget. Or maybe we won't. But we don't
  know that now, so I think
  it's better to choose a switch more specific to the fact that an
  If-Modified-Since header will be sent
  so as to avoid confusion.
  
  Do you have an idea for a better switch name that would not be too
  long?
  I have noticed that issue earlier, but could not think of a better name
  that would not be too long. :D
  
  Thank you for the suggestions,
  
  Hi Hubert,
  
  why not --if-modified-since as a boolean option ?
  
  Sounds good.
  
  I personally would set it to true by default, since it is a very
  common/basic HTTP 1.1 header.
  
  Ok, I will name the option --no-if-modified-since and will enable that
  by default.
  
  Regards, Tim




Re: [Bug-wget] Contribution on bug #45037

2015-05-17 Thread Tim Rühsen
Am Sonntag, 17. Mai 2015, 00:56:50 schrieb Ángel González:
 On 16/05/15 22:10, Ander Juaristi Alamos wrote:
  Hello,
  
  I found another task where I want to contribute, the issue with 45037.
  For what I understand, and read in the mailling list, the problem occur
  inftp.c ftp_loop_internal -  remove_link.
  
  But what I don't understand is why it tries each time to remove
  the symbolic link. What is the purpose of this statement in the overall
  context of the code? 
  As Tim pointed out, the purpose is not to overwrite the content of the
  file the link points to. If README was a regular file and then you passed
  -O README, it's okay to overwrite it (I don't know what Wget's default
  behavior is here, whether it prompts the user or whatever), because the user is
  supposed to be aware of such action. But if README was a symlink, passing
  -O README could be a malicious action, as Angel said. If you want an
  example of such an action and its consequences, google for Symlink
  race.
  
  Was that there because it was a feature, and is it now a bug because some
  other statements were removed ?
  
  If, for example, I have a local symbolic link README and try to
  download another remote file with the same name README, using -O foo,
  my symbolic link README should normally not be modified. 
  Exactly. Looking at the code, my guess is that the person who wrote it
  didn't take into account that the user could have passed the -O option
  (or maybe it wasn't yet implemented then, who knows). He or she just
  picked the name of the downloaded file.
 
 The -O option already existed in d5be8ecc (and probably in Geturl, too).
 
  But now if I pass -O README, the symbolic link will be removed because
  it's a redirection ? So it's not possible to just rename it ?
  
  It sounds like that's exactly what should be done. Treat the symlink as if
  it was a local file: if a symlink exists with the same name as the
  downloaded file, rename the downloaded file to README.1, for instance.
 Actually, instead of _renaming_ it, you should just create it with the
 right name.
 
  Sorry if my questions are silly, but I'm just starting with the code of wget,
  and my knowledge is weak. 
  Something this project has taught me is that there's no silly question.
  It's better to ask even though you're not really sure (and you've made
  some research on your own before :D) than keep quiet and make wrong
  assumptions by yourself.
 +1 Perfectly explained, Ander

ACK

 Loïc, the interesting thing is that the rest of us don't really know why
 things were done that way, either. That piece of code has been there
 since the first svn revision, in 1999 (wget 1.5.3) and very few people
 contributing back then are still active nowadays. Moreover, much more
 difficult than being there is to remember exactly why it was added!

Thank you Ángel for digging the history up.

What Loïc needs is a way forward. Otherwise he can't fix anything.

How about changing the FTP code behavior so that it works exactly the same as 
the HTTP code ? That is what the user expects. Or are there any 'riddles' 
regarding the HTTP code's symlink behavior ?

WDYT ?
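
For illustration, a symlink-safe create could look like this minimal sketch 
(hypothetical code, not the current Wget implementation):

  #include <fcntl.h>
  #include <sys/stat.h>
  #include <unistd.h>

  /* Open a download target without following an existing symlink, so a
     planted link cannot redirect the write (the 'symlink race' above). */
  static int
  open_download_target (const char *name)
  {
  #ifdef O_NOFOLLOW
    /* Fails with ELOOP if 'name' is a symlink. */
    return open (name, O_WRONLY | O_CREAT | O_TRUNC | O_NOFOLLOW, 0644);
  #else
    struct stat st;
    /* Racy, but better than nothing: refuse to write through a symlink. */
    if (lstat (name, &st) == 0 && S_ISLNK (st.st_mode))
      return -1;
    return open (name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  #endif
  }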

Regards, Tim




Re: [Bug-wget] Conditional GET requests

2015-05-15 Thread Tim Rühsen
Am Freitag, 15. Mai 2015, 12:51:16 schrieb Ander Juaristi:
 On 05/14/2015 10:35 PM, Hubert Tarasiuk wrote:
  Ok, I will name the option --no-if-modified-since and will enable that
  by default.
 
 Why not implement both ie. '--if-modified-since' and
 '--no-if-modified-since', and assert the former by default?
 
 I might have it disabled by default by putting ifmodifiedsince = off in
 .wgetrc, and if I wanted to enable it once, I would pass the
 '--if-modified-since' switch to override the behavior.

Long boolean options automatically accept a --no-... prefix, don't they ?

Tim




Re: [Bug-wget] Conditional GET requests

2015-05-15 Thread Tim Rühsen
Am Freitag, 15. Mai 2015, 19:23:41 schrieb Giuseppe Scrivano:
 Tim Rühsen tim.rueh...@gmx.de writes:
  would it not be better to not enable it by default?  At least this is what
  we do with --timestamping and having --if-modified-since by default will
  break the use case of downloading successive files as .1, .2, .3
  
  One GET including If-Modified-Since should simply replace the two requests
  HEAD + GET (or the single HEAD if it says 'no update'). Why should the backup
  strategy of Wget change ? Maybe I am just wrong... please explain in a bit
  more detail.
 
 so you mean when -N is used?
 
 I thought that enable by default meant in any case (so to imply -N
 also when not specified), if not, then I agree that it can be the
 default.

Right, I meant only when -N is enabled. Sorry for the confusion.

Tim




[Bug-wget] HTTP/2 and HPACK specs published by IETF

2015-05-15 Thread Tim Rühsen
HTTP/2 https://tools.ietf.org/html/rfc7540
HPACK https://tools.ietf.org/html/rfc7541

Have fun !

Tim




Re: [Bug-wget] Conditional GET requests

2015-05-15 Thread Tim Rühsen
Am Freitag, 15. Mai 2015, 14:15:31 schrieb Ander Juaristi:
 On 05/15/2015 01:26 PM, Tim Rühsen wrote:
  Long boolean options automatically accept a --no-... prefix, don't they ?
 
 I don't know. It might be. To be honest, I didn't think of it. Sorry if I
 screwed up.

I wasn't 100% sure either. Please don't worry about screwing things up... If 
something is unclear to you, you are normally not the only one. There is no 
problem except my neutral writing style ... I didn't mean to sound rude or 
anything like that, please excuse me.

 
 Just wanted to point it out, anyway. To make sure that no one had forgotten
 .wgetrc.

That is the right attitude and absolutely positive. .wgetrc is often not 
thought of, as you can see in Wget's option parsing design. Try to set a 
string value back to NULL after setting it to something in .wgetrc :-(

Regards, Tim




Re: [Bug-wget] Conditional GET requests

2015-05-15 Thread Tim Rühsen
Am Freitag, 15. Mai 2015, 14:38:08 schrieb Giuseppe Scrivano:
 Tim Rühsen tim.rueh...@gmx.de writes:
  Am Donnerstag, 14. Mai 2015, 15:43:54 schrieb Hubert Tarasiuk:
  W dniu 13.05.2015 o 13:28, Ander Juaristi pisze:
   And second, I'm not really sure whether --condget is the best name for
   the switch.
   Requests that include any of If-Unmodified-Since, If-Match,
   If-None-Match, or If-Range
   header fields are also conditional GETs as well.
   We might want to implement one of those in the future and we'd be
   forced
   to choose a name which could easily be
   inconsistent/confusing with --condget. Or maybe we won't. But we don't
   know that now, so I think
   it's better to choose a switch more specific to the fact that an
   If-Modified-Since header will be sent
   so as to avoid confusion.
  
  Do you have an idea for a better switch name that would not be too long?
  I have noticed that issue earlier, but could not think of a better name
  that would not be too long. :D
  
  Thank you for the suggestions,
  
  Hi Hubert,
  
  why not --if-modified-since as a boolean option ?
  
  I personally would set it to true by default, since it is a very
  common/basic HTTP 1.1 header.
 
 would it not be better to not enable it by default?  At least this is what
 we do with --timestamping and having --if-modified-since by default will
 break the use case of downloading successive files as .1, .2, .3

One GET including If-Modified-Since should simply replace the two requests 
HEAD + GET (or the single HEAD if it says 'no update'). Why should the backup 
strategy of Wget change ? Maybe I am just wrong... please explain in a bit 
more detail.
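
For illustration, the single conditional request and a 'no update' answer 
would look like this (schematic, header values invented):

  GET /index.html HTTP/1.1
  Host: example.com
  If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT

  HTTP/1.1 304 Not Modified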

Regards, Tim



Re: [Bug-wget] Conditional GET requests

2015-05-14 Thread Tim Rühsen
Am Donnerstag, 14. Mai 2015, 15:35:29 schrieb Hubert Tarasiuk:
 W dniu 13.05.2015 o 10:24, Tim Ruehsen pisze:
  Hi Hubert,
  
  nice to see your work... it looks very good to me.
  
  Just from a quick first glimpse, there are a few small points:
  
  - please avoid abort() (found in time_to_rfc1123()). The function returns
  RETROK on success but calls abort() on failure. This might end up in a
  frustrating user experience. On error, you could fall back to not using
  if-modified-since at all, fall back to using HEAD, or fall back to using a
  time value of 0 (corresponding to 1.1.1970 00:00:00). Or whatever you think is
  appropriate.
 
 We could do that but I am not sure that it would be good solution. Are
 there any cases when gmtime would have a good reason to fail?
 I think that when a function like gmtime fails, it could mean that
 something is seriously wrong; and perhaps we should not do anything but
 crash in that case (just as we do in case of xmalloc, for example).
 What do you think?

In case of malloc failing, we can hardly recover or continue proper work.
In the case of gmtime failing, we could easily recover and continue our work 
(and inform the user that something weird is going on). This makes Wget more 
robust and more reliable.

I wouldn't make assumptions about the reasons that cause gmtime to fail. It 
might be anything; implementations differ from OS to OS and from library to 
library. I saw lots of interesting things in ~30 years of software 
development. I remember seeing time() returning -1 (sporadically, some bug in 
the library code, I guess).

People try to compile and run Wget on almost *any* system, even on 20 or 30 
year old systems.
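
To make the recovery concrete, a minimal sketch (my code, not Hubert's patch; 
the -1 error convention is invented - the caller would then fall back to HEAD 
or to an unconditional GET):

  #include <stdio.h>
  #include <time.h>

  static int
  time_to_rfc1123_sketch (time_t t, char *buf, size_t bufsize)
  {
    static const char *wkday[] =
      { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
    static const char *month[] =
      { "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
    struct tm *gtm = gmtime (&t);

    if (!gtm)
      return -1;   /* recover instead of calling abort() */

    snprintf (buf, bufsize, "%s, %02d %s %04d %02d:%02d:%02d GMT",
              wkday[gtm->tm_wday], gtm->tm_mday, month[gtm->tm_mon],
              gtm->tm_year + 1900, gtm->tm_hour, gtm->tm_min, gtm->tm_sec);
    return 0;
  }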

  - typo in Test-condget.py ('usiong' instead of 'using') - have a look if your
  IDE/editor supports spell checking.
  
  - it would be nice if your test case tests all variants of HTTP-date (=
  rfc1123-date | rfc850-date | asctime-date)
 
 Do you mean to test the conversion Wget -> HTTP or HTTP -> Wget?
 In the former: Wget will currently issue dates exclusively in rfc1123
 format.
 In the latter: I agree that it should be tested (although the function
 that handles this conversion is not part of these patches, it had
 already been present in http.c (http_atotm)). Especially that the tests
 for -N do not test formats other than rfc1123, either (as far as I can
 see).

Right, Wget -> Server is not interesting.

And you are right again, http_atotm() handles the different date formats.
AFAICS, there are no tests testing the different date formats. They belong 
into one of the *-N* tests and/or into your test. Best would be both, 
explicitly testing with HEAD and with GET+If-Modified-Since requests.
It would polish your test case, making it more complete.

Regards, Tim




[Bug-wget] Extending Wget's git branching model

2015-05-14 Thread Tim Rühsen
Hi people,

we would like to discuss a slightly amended branching model for Wget with the 
community.

Taking a look at the past release model reveals some management flaws regarding 
bugfix releases. After a release like e.g. 1.6.0, reported bugs are fixed and 
committed onto 'master'. At the same time, new features and other code changes 
are also committed onto 'master'. Eventually we release 1.6.1 (1.6.2, ...) as 
a bugfix release... but as you can see, we tend to introduce new bugs when we 
change code and/or add new features at the same time. This is not very nice 
for distribution maintainers when they try to create a 'stable' distribution.

Our idea is to create a new branch on each major release, while all code 
changes are still committed onto 'master'. Additionally, each bugfix is 
cherry-picked from master into the release branch. When bug reports settle 
down (or for other reasons, like a CVE), we would eventually create a bugfix 
tag on the release branch.
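
Schematically (version numbers and commit ids invented):

  git checkout -b release-1.6 v1.6.0   # release branch at the 1.6.0 tag
  git cherry-pick <bugfix-sha>         # copy a bugfix over from master
  git tag -s v1.6.1                    # tag the bugfix release when ready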

There are much more sophisticated models for release maintenance, but we 
maintainers don't have too much time and prefer a model that is as simple as 
possible (and as complex as needed).

What do you think, what are your experiences, what are your ideas ?
Any input is appreciated.

Regards, Tim




Re: [Bug-wget] Conditional GET requests

2015-05-14 Thread Tim Rühsen
Am Donnerstag, 14. Mai 2015, 15:43:54 schrieb Hubert Tarasiuk:
 W dniu 13.05.2015 o 13:28, Ander Juaristi pisze:
  And second, I'm not really sure whether --condget is the best name for
  the switch.
  Requests that include any of If-Unmodified-Since, If-Match,
  If-None-Match, or If-Range
  header fields are also conditional GETs as well.
  We might want to implement one of those in the future and we'd be forced
  to choose a name which could easily be
  inconsistent/confusing with --condget. Or maybe we won't. But we don't
  know that now, so I think
  it's better to choose a switch more specific to the fact that an
  If-Modified-Since header will be sent
  so as to avoid confusion.
 
 Do you have an idea for a better switch name that would not be too long?
 I have noticed that issue earlier, but could not think of a better name
 that would not be too long. :D
 
 Thank you for the suggestions,

Hi Hubert,

why not --if-modified-since as a boolean option ?

I personally would set it to true by default, since it is a very common/basic 
HTTP 1.1 header.

Regards, Tim




Re: [Bug-wget] Task 44817 http/gzip compression

2015-05-14 Thread Tim Rühsen
Hi Loïc, hi Ander,

Loïc, very good that you asked before starting a non-trivial task.
Are you aware of the GNU/Wget copyright assignment for non-trivial code 
contributions !?

Am Donnerstag, 14. Mai 2015, 15:50:52 schrieb Loïc Maury:
 Hi Ander,
 
 On Thu, May 14, 2015 at 3:44 PM, Ander Juaristi ajuari...@gmx.es wrote:
  On 05/14/2015 02:37 PM, Loïc Maury wrote:
  Hello,
  
   Hi Loic,
   
   I'am new to wget and I'am interested about the task 44817 -  http/gzip
   
  compression.
  I want to try to implement it, if no one else work on it.
  
  I'm not really sure, but I think Tim was working on some new features to
  merge on Wget
  arising from his project mget, one of which was gzip/bzip/lzma support.

While the code is up and running, it will take some time to generate a Wget2 
project out of it. It would introduce a new code base which won't integrate 
into Wget 1.x (at least I doubt it). Sorry if I didn't make this clear to 
you, Ander.

 Ok, not a problem, maybe I can find another task to do.

If you are interested in Wget, there are many other tasks as well. Just tell 
us what direction you want to go... and we'll find an appropriate task for 
you. Of course, feel free to work on any of the bugs.

We are always in need of help.

Regards, Tim




Re: [Bug-wget] [PATCH 3/3] Redirect containing %2B behaves differently depending on locale

2015-05-12 Thread Tim Rühsen
Am Mittwoch, 22. April 2015, 16:14:06 schrieb Ander Juaristi:
 On 04/22/2015 03:47 PM, Ander Juaristi wrote:
  On 04/21/2015 04:19 PM, Darshit Shah wrote:
  Regarding the patch itself, I wanted to ask if it would not be cleaner to
  dig into the code and replace every call to url_unescape with the new
  prototype? In my opinion that would help in maintaining readability and
  more importantly maintainability of the code. 
  I thought of it too, and I agree with you. The reason I haven't done it is
  because I'm not really sure whether all the functions that call
  url_unescape need the reserved characters escaped or not. I believe
  there'll be no problems, but I didn't want to just blindly replace all
  the calls to url_unescape without even having a quick look, which is
  exactly what I didn't have time to do so far. What do you guys think?
  I'll have a closer look as soon as I can (and provided no one does it
  before) and roll another patch with the replacements. Unless of course
  someone already knows the answer.
  
  Regarding the patches, I resend them with the changes made according to
  your feedback. 
  Changes made so far:
   - Merged the prototype patch into 1.
   - Shortened commit messages.
   - New test added to Makefile.am (in patch 2).
 
 Forgot to mention some files. Silly me :-(

Thanks Ander !

I pushed your patches.

Tim




Re: [Bug-wget] [bug #20398] Save a list of the links that were not followed

2015-05-07 Thread Tim Rühsen
Hi Jookia,

if you want us to include your patch (and it is welcome of course), 
you have to sign a copyright assignment.

Please email the following information to ass...@gnu.org with a CC
to dar...@gmail.com, tim.rueh...@gmx.de and gscriv...@gnu.org, and we
will send you the assignment form for your past and future changes.


Please use your full legal name (in ASCII characters) as the subject
line of the message.
--
REQUEST: SEND FORM FOR PAST AND FUTURE CHANGES

[What is the name of the program or package you're contributing to?]


[Did you copy any files or text written by someone else in these changes?
Even if that material is free software, we need to know about it.]


[Do you have an employer who might have a basis to claim to own
your changes?  Do you attend a school which might make such a claim?]


[For the copyright registration, what country are you a citizen of?]


[What year were you born?]


[Please write your email address here.]


[Please write your postal address here.]





[Which files have you changed so far, and which new files have you written
so far?]





Am Donnerstag, 7. Mai 2015, 15:58:53 schrieb Jookia:
 Follow-up Comment #5, bug #20398 (project wget):
 
 I've found myself in need of this feature. I'm trying to download a website
 recursively without pulling in every single ad and its HTML. I'd like to be
 able to find out which URLs were rejected, why, and information about the
 domains (host, port, etc.)
 
 I've patched my copy of Wget to dump all of this in to a CSV file which I
 can then tool through to get my desired results:
 
 
 
 % grep DOMAIN rejected.csv | head -1
 DOMAIN,http://c0059637.cdn1.cloudfiles.rackspacecloud.com/flowplayer-3.2.6.m
 in.js,SCHEME_HTTP,c0059637.cdn1.cloudfiles.rackspacecloud.com,80,flowplayer-
 3.2.6.min.js,(null),(null),(null),http://redated/,SCHEME_HTTP,redacted,80,,(
 null),(null),(null) % grep DOMAIN rejected.csv | cut -d, -f4 | sort |
 uniq
 0.gravatar.com
 1.gravatar.com
 c0059637.cdn1.cloudfiles.rackspacecloud.com
 lh3.googleusercontent.com
 lh4.googleusercontent.com
 lh5.googleusercontent.com
 lh6.googleusercontent.com
 
 
 I've included a patch made in a few hours that does this.
 
 (file #33955)
 ___
 
 Additional Item Attachment:
 
 File name: 0001-rejected-log-Add-option-to-dump-URL-rejections-to-a-.patch
 Size:14 KB
 
 
 ___
 
 Reply to this item at:
 
   http://savannah.gnu.org/bugs/?20398
 
 ___
   Message sent via/by Savannah
   http://savannah.gnu.org/



Re: [Bug-wget] GSoC15: Speed up Wget's Download Mechanism

2015-04-30 Thread Tim Rühsen
Am Donnerstag, 30. April 2015, 18:45:05 schrieb Daniel Stenberg:
 On Thu, 30 Apr 2015, Tim Ruehsen wrote:
  Originally, Gisle talked about CPU cycles, not elapsed time.
  That is quite a difference...
 
 Thousands of cycles per invoke * many invokes = measurable elapsed time

Again: That is quite a difference...

1 GHz CPU: 1 cycle ~ 1 ns, so 1000 cycles take 1000 * 1 ns = 1 us (microsecond). 
But if one packet arrives 10 ms later (pretty normal on the network), that is 
equal to ~10 million cycles (about 10,000 calls to 
run_with_timeout, if Gisle's assumptions are right).

How could you distinguish these two, latency and wasted cycles ?

On Linux even the 'time' command is helpful here (if your download is 
large enough to generate a CPU cycle footprint > a few ms).
Much better is 'valgrind --tool=callgrind' plus a tool like kcachegrind.
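
For example (URL and file name invented):

  valgrind --tool=callgrind src/wget -O /dev/null http://example.com/big.iso
  kcachegrind callgrind.out.*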

But Gisle is on Windows... I don't know what tools are available there.

Regards, Tim




Re: [Bug-wget] avoiding a large number of HEAD reqs when resuming

2015-04-30 Thread Tim Rühsen
Am Donnerstag, 30. April 2015, 12:04:18 schrieb User Goblin:
 The situation: I'm trying to resume a large recursive download of a site
 with many files (-r -l 10 -c)
 
 The problem: When resuming, wget issues a large number of HEAD requests
 for each file that it already downloaded. This triggers the upstream
 firewall, making the download impossible.
 
 My initial idea was to parse wget's -o output and figure out which files
 still need to be downloaded, and then feed them via -i when continuing the
 download. This led me to the conclusion that I'd need two pieces of
 functionality, (1) machine-parseable output of -o, and (2) a way to convert
 a partially downloaded directory structure to links that still need
 downloading.
 
 I could work around (1), the output of -o is just hard to parse.
 
 For (2), I could use lynx or w3m or something like that, but then I never
 am sure that the links produced are the same that wget produced. Therefore
 I'd love an option like `wget --extract-links ./index.html` that'd just
 read an html file and produce a list of links on output. Or perhaps an
 assertion that some other tool like urlscan will do it exactly the same way
 as wget.
 
 There's a third idea that we discussed on IRC with darnir, namely having
 wget store its state when downloading. That would solve the original problem
 and would be pretty nice. However, I'd still like to have (1) and (2) done,
 because I'm also thinking of distributing this large download to a number
 of IP addresses, by running many instances of wget on many different
 servers (and writing a script that'd distribute the load).
 
 Thoughts welcome :-)

The top-down approach would be something like

wget -r --extract-links | distributor host1 host2 ... hostN

'distributor' is a program that starts one instance of wget on each given host, 
takes the (absolute) URLs via stdin, and hands them to the wget instances (e.g. 
via round-robin... better would be to know whether a file download has 
finished).

I assume '-r --extract-links' does not download, but just recursively 
scans/extracts the existing files !?

Wget also has to be adjusted to start downloading immediately on the first URL 
read from stdin. Right now it collects all URLs until stdin closes and then 
starts downloading.

I wrote a C library for the next-gen Wget (we start moving the code to wget 
this autumn) with which you can also do the extraction part. There are small C 
examples that you might extend to work recursively. It works with CSS and HTML.

https://github.com/rockdaboot/mget/tree/master/examples

Regards, Tim
 



Re: [Bug-wget] [PATCH 2/2] openssl: Read cert from private key file if only private key file is given

2015-04-27 Thread Tim Rühsen
Thanks Rohit !

I pushed your patches with a slightly amended commit message.


Regards, Tim

Am Freitag, 24. April 2015, 15:48:30 schrieb Rohit Mathulla:
 * src/openssl.c (ssl_init): Assign opt.cert_{file, type} from
 opt.private_key(_type) ---
 
 While making the previous double free patch, I saw that openssl doesn't have
 a check for the case where --private-key is given but not --certificate. I
 don't know if there is a specific reason for openssl not having it while
 gnutls does but I'm sending this as a separate patch just in case.
 
 Thanks,
 Rohit
 
  src/openssl.c | 7 +++
  1 file changed, 7 insertions(+)
 
 diff --git a/src/openssl.c b/src/openssl.c
 index b6cdb8d..3ac0f44 100644
 --- a/src/openssl.c
 +++ b/src/openssl.c
 @@ -296,6 +296,13 @@ ssl_init (void)
        opt.private_key_type = opt.cert_type;
      }
 
 +  /* Use cert from private key file unless otherwise specified. */
 +  if (opt.private_key && !opt.cert_file)
 +    {
 +      opt.cert_file = xstrdup (opt.private_key);
 +      opt.cert_type = opt.private_key_type;
 +    }
 +
    if (opt.cert_file)
      if (SSL_CTX_use_certificate_file (ssl_ctx, opt.cert_file,
                                        key_type_to_ssl_type (opt.cert_type))




Re: [Bug-wget] gethttp cleanup

2015-04-19 Thread Tim Rühsen
Am Donnerstag, 9. April 2015, 23:41:22 schrieb Hubert Tarasiuk:
 W dniu 01.04.2015 o 17:01, Giuseppe Scrivano pisze:
  Hubert Tarasiuk hubert.taras...@gmail.com writes:
  When these two issues are dealt with, a common cleanup code for
  `gethttp` will be easily possible for variables:
  - req
  - resp
  - head
  - message
  - type
  
  I went ahead and pushed your patches!
 
 Hello developers,
 
 I have prepared common cleanup code for the following variables in
 http.c (gethttp):
 req, resp, head, message, type
 
 I added a single exit point to that function.
 
 It should be a good starting point for further refactoring inside that
 function.
 
 Please have a look at the patch and let me know what you think about it.
 
 I was thinking about a simple macro like:
  #define GETHTTP_CLEAN_RETURN(x) do \
    { \
      retval = (x); \
      goto cleanup; \
    } while (0)
 
 Then instead of writing:
  retval = XYZ;
  goto cleanup;
 
 we could simply write:
  GETHTTP_CLEAN_RETURN (XYZ);
 
 However I am not sure if it will not obfuscate the code, or if it is
 good style/convention.

It depends on whom you ask :-)
I think it is good style to have a single 'cleanup/exit' path. I find it easier 
to read, and leaks are more obvious.

If no one complains (well, no one did so far), I'll push the code tomorrow.

Thanks for your work, Hubert.

Tim




Re: [Bug-wget] WGET 1.15

2015-04-12 Thread Tim Rühsen
Am Sonntag, 12. April 2015, 07:15:49 schrieb Bernard Veasey:
 Hi Tim
 
 Now that I've got WGET 1.15.1 working on RISC OS fetching a file, I have
 come across another problem.  When WGET is called, even if it's only
 WGET --help
 it stops Mplayer playing Internet Radio.
 If I revert to an older version of WGET (1.11.2), the problem does not
 happen.  How can I get WGET 1.15.1 to work without this problem
 please?

Hi Bernard,

please write to the mailing list (bug-wget@gnu.org) and not privately to me.

Mplayer may stop playing if Wget takes all the available bandwidth.
Try out the --limit-rate option.

Tim




Re: [Bug-wget] Memory leak in idn_encode; Valgrind suppression file

2015-04-11 Thread Tim Rühsen
Am Donnerstag, 9. April 2015, 23:14:22 schrieb Hubert Tarasiuk:
 W dniu 08.04.2015 o 22:21, Tim Rühsen pisze:
  Could you use $srcdir instead of Cwd ? I didn't test it, but it could fail
  when using DISTCHECK_CONFIGURE_FLAGS=VALGRIND_TESTS=1 make distcheck.
  At least we should use the same path mechanism as with e.g. 'certs' in
  Test-proxied-https-auth.px.
 It did fail and I have fixed the problem. (Cwd is still used, because
 $srcdir contains a relative path, and the current directory changes later.)
 It now passes
 $ DISTCHECK_CONFIGURE_FLAGS=VALGRIND_TESTS=1 make distcheck
 (After applying both memory leak patch and Valgrind suppression file patch.)
 
 Thank you in advance for the review.
 
 Best regards,
 Hubert

The patch has been pushed.

Thank you.

Tim




Re: [Bug-wget] Memory leak in idn_encode; Valgrind suppression file

2015-04-10 Thread Tim Rühsen
Am Mittwoch, 8. April 2015, 21:09:28 schrieb Ángel González:
 On 08/04/15 19:08, Hubert Tarasiuk wrote:
  W dniu 07.04.2015 o 00:11, Ángel González pisze:
  On 06/04/15 22:15, Hubert Tarasiuk wrote:
  We should probably also fix this comment:
   /* sXXXav : free new when needed ! */
  
  As it presumably mentions the problem that we are going to repair.
  
  I thought it referred to idna_to_ascii_8z sometimes allocating memory on
  error (that's
  why I didn't touch it), but it may as well refer to this.
  
  That would make more sense based on the location of the comment.
  I looked at the current source code for idna_to_ascii_8z (
  http://www.gnu.org/software/libidn/doxygen/idna_8c_source.html#l00572 )
  and if I am not mistaken, it should not happen that the *output will be
  set to allocated memory, and the function will fail.
  (I did not find anything concerning this problem in the manual of this
  function.)
  Maybe the comment is out of date? (Commited in July 2008.) Is the author
  of it on this list?
 
 Saint Xavier was a GSoC contributor in 2008. He *was* subscribed in
 January 2009
 (when he sent his last post), but I doubt he is still reading this
 mailing list.
 
 Maybe Micah knows.
 
 9a2ea39 added that comment at the same time as the remote_to_utf8 call.
 So it may be referring to its allocation, or simply mean he noticed an
 idna_to_ascii_8z
 when preparing that code. :/

Hmmm, if the bug in idna_to_ascii_8z() has been fixed (as Hubert says), we 
should simply remove the comment (as Hubert suggests).

Hubert, could you amend the patch ? Preferably with Ángel's variable renaming.

Regards, Tim




Re: [Bug-wget] Memory leak in idn_encode; Valgrind suppression file

2015-04-07 Thread Tim Rühsen
Am Montag, 6. April 2015, 22:26:27 schrieb Tim Rühsen:
 Hi Hubert,
 
 Am Montag, 6. April 2015, 12:57:18 schrieb Hubert Tarasiuk:
  Hello devs,
  
 Also, IMHO it is worth adding a suppression file for valgrind tests in
 wget. Otherwise, the tests do not pass. (Apart from the bug mentioned
 above, there is a Valgrind false positive at `idna_to_ascii_4z` in
  `libidn.so`.) And since the first part (`tests`) fails, `make check`
  does not even make it to the second part (`testenv`). (Which is probably
  another bug, not a feature :D.)
  The problem is, that I do not see a simple way to add it in one place;
  as Valgrind invocation is handled separately in both `testenv` and
  `tests`. However, the problem appears only in `tests` currently,
  therefore maybe it could be added just there.
  I am attaching my workaround for the problem (Valgrind's suppression
  file and a patch to WgetTests.pm) - it could be probably done in a more
  elegant way.
 
 Good fix (also the idna issue that Ángel answered to) !
 
 Valgrind suppressions are a bit compiler/architecture/distribution
 dependent. Maybe you could add a comment to the suppression file with
 this info, as a quick reference and explanation.
 
 And yes, if you find a proper way to execute the 'testenv' part no matter if
 'tests' fails or not, a patch would be welcome. It always annoyed me, but
 not enough to put work into it :-(

Sorry, one point I missed: Please put the suppression file into the EXTRA_DIST 
variable in tests/Makefile.am. Otherwise it won't go into the tarball (make dist).

 
 Regards, Tim




Re: [Bug-wget] Memory leak in idn_encode; Valgrind suppression file

2015-04-06 Thread Tim Rühsen
Hi Hubert,

Am Montag, 6. April 2015, 12:57:18 schrieb Hubert Tarasiuk:
 Hello devs,
 
 Also, IMHO it is worth adding a suppression file for valgrind tests in
 wget. Otherwise, the tests do not pass. (Apart from the bug mentioned
 above, there is a Valgrind false positive at `idna_to_ascii_4z` in
 `libidn.so`.) And since the first part (`tests`) fails, `make check`
 does not even make it to the second part (`testenv`). (Which is probably
 another bug, not a feature :D.)
 The problem is, that I do not see a simple way to add it in one place;
 as Valgrind invocation is handled separately in both `testenv` and
 `tests`. However, the problem appears only in `tests` currently,
 therefore maybe it could be added just there.
 I am attaching my workaround for the problem (Valgrind's suppression
 file and a patch to WgetTests.pm) - it could be probably done in a more
 elegant way.

Good fix (also the idna issue that Ángel answered to) !

Valgrind suppressions are a bit compiler/architecture/distribution dependent. 
Maybe you could add a comment to the suppression file with this info, 
as a quick reference and explanation.

And yes, if you find a proper way to execute the 'testenv' part no matter if 
'tests' fails or not, a patch would be welcome. It always annoyed me, but not 
enough to put work into it :-(

Regards, Tim




Re: [Bug-wget] Redirect containing %2B behaves differently depending on locale

2015-04-03 Thread Tim Rühsen
Hi Ander,

Am Freitag, 3. April 2015, 12:26:09 schrieb Ander Juaristi:
 On 03/13/2015 11:48 PM, Adam Sampson wrote:
  Hi,
  
  I've just found a case where wget 1.16.3 responds to a 302 redirect
  differently depending on whether it's in an ASCII or UTF-8 locale.
  
  This works:
  LC_ALL=en_GB.UTF-8 wget
  https://bitbucket.org/pypy/pypy/downloads/pypy-2.5.0-src.tar.bz2
  
  This doesn't work:
  LC_ALL=C wget
  https://bitbucket.org/pypy/pypy/downloads/pypy-2.5.0-src.tar.bz2
  
  I've attached logs with -d showing what's actually going on. The
  
  initial request gives a 302 response with a Location: that contains:
 tar.bz2?Signature=up6%2BtTpSF...
  
  In the UTF-8 locale, wget correctly redirects to that location.
  
  In the ASCII locale, wget -d print a converted: '...' - '...' line
  
  (from iri.c's do_conversion), then redirects to:
 tar.bz2?Signature=up6+tTpSF...
  
  (If you try it yourself you'll get a slightly different URL, but at
  least for me it usually contains %2B somewhere.)
  
  This appears to be because do_conversion calls url_unescape on the
  input string it's given -- even though that input string is a _const_
  char * in the code that calls it (main - retrieve_url - url_parse -
  remote_to_utf8 - do_conversion). It's not immediately obvious to me
  whether that's intentional or not; at the very least, it's a surprising
  bit of behaviour.
 
 That call to url_unescape() is necessary because iconv() needs the multibyte
 characters with no encoding. My first approach, by the way, was to remove
 that call, but that caused Test-iri-percent.px to fail, which is pretty
 clear.
 
 The issue seems to be at the call to reencode_escapes(), just after
 remote_to_utf8() returns. The problem here is that %2B resolves to +
 (literal). And that character is equal to the reserved character +, and
 reencode_escapes() treats it as a reserved character and leaves it as-is.
 The same happens with other characters, such as = (%3D).
 
 What I propose is to tag the characters that have been decoded, in
 url_unescape(), and then in reencode_escapes(), verify if they coincide
 with reserved characters as well.
 
 What do you guys think?

Without looking at the code right now and from what you describe above, your 
proposal sounds like a good idea. This problem pops up again and again. If you 
solve the issue, some people will love you :-)
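
Just to sketch the direction (my code, not Ander's patch; the function name 
and the reserved set, taken from RFC 2396, are assumptions):

  #include <ctype.h>
  #include <string.h>

  /* Hex digit to value; assumes isxdigit() has already been checked. */
  static int
  hexval (int c)
  {
    return isdigit (c) ? c - '0' : tolower (c) - 'a' + 10;
  }

  /* Decode %XX in place, except when it encodes a reserved character like
     '+' or '='; those stay escaped so a later reencode_escapes() pass
     cannot alter their meaning. */
  static void
  url_unescape_except_reserved (char *s)
  {
    static const char reserved[] = ";/?:@&=+$,";
    char *w = s;

    while (*s)
      {
        if (s[0] == '%' && isxdigit ((unsigned char) s[1])
            && isxdigit ((unsigned char) s[2]))
          {
            int c = hexval (s[1]) * 16 + hexval (s[2]);
            if (c && strchr (reserved, c))
              { *w++ = *s++; *w++ = *s++; *w++ = *s++; }  /* keep escaped */
            else
              { *w++ = (char) c; s += 3; }                /* decode */
          }
        else
          *w++ = *s++;
      }
    *w = '\0';
  }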

Regards, Tim




Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?

2015-03-31 Thread Tim Rühsen
Hi Stephen,

Am Dienstag, 31. März 2015, 18:11:58 schrieb Stephen Wells:
 Dear all - I am currently trying to use wget to obtain mp3 files from the
 Google Translate TTS system. In principle this can be done using:
 
 wget -U Mozilla -O ${string}.mp3 
 "http://translate.google.com/translate_tts?tl=TL&q=${string}"
 
 where TL is a two-letter language code (en, fr, de and so on).
 
 However I am meeting a serious error when I try to send Russian strings
 (tl=ru) in Cyrillic characters. I'm working in a UTF-8 environment (under
 Cygwin) and the file system will display the cyrillic strings no problem.
 If I provide a command like this:
 
 http://translate.google.com/translate_tts?tl=ru&q=мазать
 
 wget incorrectly processes the Cyrillic characters _before_ sending the
 http request, so what it actually requests is:
 
 http://translate.google.com/translate_tts?tl=ru&q=%D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C

This seems to be the correct behavior of a web client.
The URL in the GET request is transmitted UTF-8 encoded, and percent escaping 
is performed for chars > 127 (not mentioning control chars here).

 This of course produces a string of gibberish in the resulting mp3 file!

This is something different. If you are talking about the file name, well 
there is --restrict-file-names=nocontrol. Did you give it a try ?

 Is there any way to make wget actually send the string it is given, instead
 of mangling it on the way out? This is really blocking me.

From what you write, I am unsure if you are talking about the resulting file 
name or about HTTP URL encoding in a GET request.

Regards, Tim




Re: [Bug-wget] [Student] Hi

2015-03-21 Thread Tim Rühsen
Am Mittwoch, 18. März 2015, 04:31:11 schrieb Gowtham Ashok:
 Hello,
 
Hi Gowtham,

nice to see you being interested in Wget development.

 I'm Gowtham Ashok, a senior undergraduate from Madras Institute of
 Technology, India. I'm a KDE contributor and have mentored Google Code-in
 2014 for them.
 I use Wget regularly and would like to contribute to it. I thought doing a
 GSoC project would be a good starting point. Am I too late?

Not *too* late, but late.
You would have to work hard to get your application ready in time.


 I have compiled and tested Wget from source. I have seen the page
 https://github.com/darnir/wget/wiki/GSoC-2015, but most of the bugs listed
 there seem to be taken by a student.
 I'd like to know if there are any simple open bugs to fix.

Besides the Savannah bugs there are also several outstanding Coverity bugs.

Go to https://scan.coverity.com, 'Projects Using Scan', search for wget and 
click on 'Add me to project'. Darshit will (hopefully) let you in and then you 
can browse and work on the list of defects. Skip the ones you are unsure of.

Regards, Tim




[Bug-wget] [PATCH] two Coverity bugs fixed

2015-03-16 Thread Tim Rühsen
I would like to push these two patches tomorrow if nobody complains.

Tim
From e49a93abda978c42771b7055d6a3134cf052952a Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Mon, 16 Mar 2015 21:12:22 +0100
Subject: [PATCH 1/2] src/ftp.c: make sure warc_tmp becomes closed before
 return

Reported-by: Coverity bug #1188044
---
 src/ftp.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/ftp.c b/src/ftp.c
index 5a1d38b..1902242 100644
--- a/src/ftp.c
+++ b/src/ftp.c
@@ -1696,7 +1696,7 @@ ftp_loop_internal (struct url *u, struct fileinfo *f, ccon *con, char **local_fi
 case UNLINKERR: case WARC_TMP_FWRITEERR:
   /* Fatal errors, give up.  */
   if (warc_tmp != NULL)
-fclose (warc_tmp);
+  fclose (warc_tmp);
   return err;
 case CONSOCKERR: case CONERROR: case FTPSRVERR: case FTPRERR:
 case WRITEFAILED: case FTPUNKNOWNTYPE: case FTPSYSERR:
@@ -1771,10 +1771,12 @@ ftp_loop_internal (struct url *u, struct fileinfo *f, ccon *con, char **local_fi

   warc_res = warc_write_resource_record (NULL, u->url, NULL, NULL,
                                          warc_ip, NULL, warc_tmp, -1);
+
   if (! warc_res)
 return WARC_ERR;

   /* warc_write_resource_record has also closed warc_tmp. */
+  warc_tmp = NULL;
 }

   if (con-cmd  DO_LIST)
@@ -1821,6 +1823,9 @@ Removing file due to --delete-after in ftp_loop_internal():\n));
   if (local_file)
 *local_file = xstrdup (locf);

+  if (warc_tmp != NULL)
+fclose (warc_tmp);
+
   return RETROK;
  } while (!opt.ntry || (count < opt.ntry));

@@ -1829,6 +1834,10 @@ Removing file due to --delete-after in ftp_loop_internal():\n));
   fd_close (con->csock);
   con->csock = -1;
 }
+
+  if (warc_tmp != NULL)
+fclose (warc_tmp);
+
   return TRYLIMEXC;
 }

--
2.1.4

From 1fbc58b2e0b5be09c91532e9d1465fe54cdcb057 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Mon, 16 Mar 2015 21:28:25 +0100
Subject: [PATCH 2/2] src/http.c: fix error return of
 digest_authentication_encode()

Reported-by: Coverity bug #1188036
---
 src/http.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/http.c b/src/http.c
index b7020ef..a4c30d5 100644
--- a/src/http.c
+++ b/src/http.c
@@ -3859,15 +3859,17 @@ digest_authentication_encode (const char *au, const char *user,

   if (!realm || !nonce || !user || !passwd || !path || !method || !qop)
 {
+  if (!qop)
+    *auth_err = UNKNOWNATTR;
+  else
+    *auth_err = ATTRMISSING;
+
   xfree (realm);
   xfree (opaque);
   xfree (nonce);
   xfree (qop);
   xfree (algorithm);
-  if (!qop)
-    *auth_err = UNKNOWNATTR;
-  else
-    *auth_err = ATTRMISSING;
+
   return NULL;
 }

--
2.1.4





Re: [Bug-wget] [PATCH] Added wget http test for 503 Service unavailable

2015-03-14 Thread Tim Rühsen
Hi Satyam,

good work, thanks for your contribution !

Just two little things are missing for a complete patch:
- please add the test to testenv/Makefile.am (so it is executed with 'make 
check')
- please chmod a+x testenv/Test-503.py (so one can execute the test stand-
alone if needed)


And a general question to everybody about temporary failures and Wget. 
Shouldn't wget try again after a pause in case of temp failures ? 503 is such 
a temp failure.
I guess that would be in the 'spirit' of Wget !?
[Satyam, that does not affect your patch right now]

Tim

Am Samstag, 14. März 2015, 03:18:07 schrieb Satyam Zode:
 Hello everyone !
 I am working wget FTP test suite . I have written a Test for 503
 Service unavailable feature of wget (For HTTP server).
 Please verify it and correct me if I am making any mistake.
 All suggestions are welcome :-) . I am also working on other tests too .
 
 
 Here is Test-503 patch .
 
 From ce32f9ee17fcd9544a34cf9e3656ee7e10ea289d Mon Sep 17 00:00:00 2001
 From: Satyam Zode satyamz...@gmail.com
 Date: Sat, 14 Mar 2015 02:46:26 +0530
 Subject: [PATCH] Added wget http  test for 503 Service unavailable
 
 ---
  testenv/Test-503.py | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++
  1 file changed, 60 insertions(+)
  create mode 100644 testenv/Test-503.py
 
 diff --git a/testenv/Test-503.py b/testenv/Test-503.py
 new file mode 100644
 index 000..7f1c3c8
 --- /dev/null
 +++ b/testenv/Test-503.py
 @@ -0,0 +1,60 @@
 +#!/usr/bin/env python3
 +from sys import exit
 +from test.http_test import HTTPTest
 +from misc.wget_file import WgetFile
 +
 +"""
 +This test ensures that Wget handles a 503 Service Unavailable response
 +correctly.
 +"""
 +TEST_NAME = "503 Service Unavailable"
 +############# File Definitions ###############################################
 +File1 = """All happy families are alike;
 +Each unhappy family is unhappy in its own way"""
 +File2 = "Anyone for chocochip cookies?"
 +
 +File1_rules = {
 +    "Response"          : 503
 +}
 +
 +A_File = WgetFile ("File1", File1, rules=File1_rules)
 +B_File = WgetFile ("File2", File2)
 +
 +Request_List = [
 +    [
 +        "GET /File1",
 +        "GET /File2",
 +    ]
 +]
 +
 +
 +WGET_OPTIONS = "--tries=2"
 +WGET_URLS = [["File1", "File2"]]
 +
 +Files = [[A_File, B_File]]
 +
 +ExpectedReturnCode = 8
 +ExpectedDownloadedFiles = [B_File]
 +
 +############# Pre and Post Test Hooks ########################################
 +pre_test = {
 +    "ServerFiles"       : Files
 +}
 +test_options = {
 +    "WgetCommands"      : WGET_OPTIONS,
 +    "Urls"              : WGET_URLS
 +}
 +post_test = {
 +    "ExpectedFiles"     : ExpectedDownloadedFiles,
 +    "ExpectedRetcode"   : ExpectedReturnCode,
 +    "FilesCrawled"      : Request_List
 +}
 +
 +err = HTTPTest (
 +                name=TEST_NAME,
 +                pre_hook=pre_test,
 +                test_params=test_options,
 +                post_hook=post_test
 +).begin ()
 +
 +exit (err)




Re: [Bug-wget] --connect-timeout doesn't work on Windows

2015-03-14 Thread Tim Rühsen
Am Samstag, 14. März 2015, 14:27:14 schrieb Jernej Simončič:
 It appears that --connect-timeout doesn't work on recent versions of
 wget when running on Windows (compiled with mingw). I have confirmed
 this with my own builds of wget (available at
 https://eternallybored.org/misc/wget/, both 1.16.3 and 1.13), and
 with the builds available here:
 http://opensourcepack.blogspot.com/2010/05/wget-112-for-windows.html.
 
 This is pretty trivial to test - I used
   wget --connect-timeout=1 http://192.0.2.1:12345/
 ...and wget just hung there (I left it for a few minutes).
 
 The old gnuwin32 build (1.11.4) doesn't have this problem.

There has been commit 4a685764a845d5c74a76fcb49a4671f055b8d5f4 (15.5.2011) 
between release 1.12 and 1.13. It sets the socket to blocking after calling 
gnulib's select().
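
For context, the usual portable pattern is a non-blocking connect() plus 
select(), with the socket switched back to blocking mode afterwards - that 
last step is what the commit added. A POSIX-flavoured sketch (not the actual 
Wget/gnulib code):

  #include <errno.h>
  #include <fcntl.h>
  #include <sys/select.h>
  #include <sys/socket.h>

  static int
  connect_with_timeout (int fd, const struct sockaddr *sa, socklen_t salen,
                        int timeout_sec)
  {
    struct timeval tv = { timeout_sec, 0 };
    fd_set wfds;
    int err = 0;
    socklen_t len = sizeof err;

    fcntl (fd, F_SETFL, fcntl (fd, F_GETFL, 0) | O_NONBLOCK);

    if (connect (fd, sa, salen) < 0)
      {
        if (errno != EINPROGRESS)
          return -1;
        FD_ZERO (&wfds);
        FD_SET (fd, &wfds);
        if (select (fd + 1, NULL, &wfds, NULL, &tv) <= 0)
          return -1;                  /* timed out (or select() error) */
        if (getsockopt (fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err)
          return -1;                  /* connect failed asynchronously */
      }

    /* Back to blocking mode - forgetting or misplacing this step is
       exactly the kind of detail that commit touches. */
    fcntl (fd, F_SETFL, fcntl (fd, F_GETFL, 0) & ~O_NONBLOCK);
    return 0;
  }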

Maybe someone with Windows access can have a look !?

Regards, Tim




Re: [Bug-wget] wget-1.16.1.46-3d8e7 missing hint in the tarball

2015-02-18 Thread Tim Rühsen
Hi Darshit,

Am Mittwoch, 18. Februar 2015, 22:39:18 schrieb Darshit Shah:
 Readme.checkout is a file that is only available in the git
 repository. I guess this is a side effect of having
 gitlog-to-changelog generate our changelogs. A reference to this file
 has crept into the tarball.
 
 libio-socket-ssl-perl is required only for running the test suite and
 hence it not included in the README for the tarball.

AFAICS, the test suite is included in the tarball, isn't it ?

So there is no reason not to run the test suite after compiling.
To do so, there should be a list of the needed software packages (a list that 
is needed for compiling anyway).

What about renaming README.checkout to README + small amendments to the text 
to distinguish between compiling from tarball and git ? 

What do you think ?

Tim




Re: [Bug-wget] [bug #44090] The function strcasestr is strictly non-standard and non-portable thus wget is broken

2015-02-10 Thread Tim Rühsen
Hi Dennis,

I put this discussion onto the wget mailing list.

There is Dagobert Michelsen, a Solaris expert.
He runs a Solaris build farm. I git-cloned wget on that farm and built it 
from scratch without problems. (I did not try the official tarball.)

Of course I don't know in detail how the systems are configured.
If he has time, I am sure he could give you some hints and answers some of 
your questions.

I guess you should subscribe to the list (if not already done) to not miss an 
answer. See https://lists.gnu.org/mailman/listinfo/bug-wget

@Dago: the issue started here: https://savannah.gnu.org/bugs/?44090 
Could you give some advice or hints on this issue ?


My compilation says:
rockdaboot@unstable10x [global]:~/wget  src/wget --version
GNU Wget 1.16.1.40-8705-dirty built on solaris2.10.

+digest +https +ipv6 +iri +large-file -nls +ntlm +opie +psl +ssl/gnutls 

Wgetrc: 
/usr/local/etc/wgetrc (system)  
 
Compile:
 
gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC=/usr/local/etc/wgetrc 
 
-DLOCALEDIR=/usr/local/share/locale -I. -I../lib -I../lib 
 
-D_REENTRANT -I/opt/csw/include -I/opt/csw/include  
 
-I/opt/csw/include/p11-kit-1 -DHAVE_LIBGNUTLS -I/opt/csw/include
 
-I/opt/csw/include -DNDEBUG 
 
Link:   
 
gcc -I/opt/csw/include -I/opt/csw/include   
 
-I/opt/csw/include/p11-kit-1 -DHAVE_LIBGNUTLS -I/opt/csw/include 
-I/opt/csw/include -DNDEBUG -liconv -L/opt/csw/lib -lpcre -luuid 
-lnettle -L/opt/csw/lib -lgnutls -L/opt/csw/lib -lz -L/opt/csw/lib 
-lpsl -lsocket -lnsl -lrt -lidn -lrt ftp-opie.o gnutls.o 
http-ntlm.o ../lib/libgnu.a 

Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
http://www.gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Originally written by Hrvoje Niksic hnik...@xemacs.org.
Please send bug reports and questions to bug-wget@gnu.org.

rockdaboot@unstable10x [global]:~/wget  file src/wget
src/wget: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically 
linked (uses shared libs), not stripped

rockdaboot@unstable10x [global]:~/wget  ldd src/wget
libiconv.so.2 => /opt/csw/lib/libiconv.so.2
libpcre.so.1 =>  /opt/csw/lib/libpcre.so.1
libuuid.so.1 =>  /opt/csw/lib/libuuid.so.1
libnettle.so.4 => /opt/csw/lib/libnettle.so.4
libgnutls.so.28 =>   /opt/csw/lib/libgnutls.so.28
libz.so.1 => /opt/csw/lib/libz.so.1
libpsl.so.0 =>   /opt/csw/lib/libpsl.so.0
libsocket.so.1 => /lib/libsocket.so.1
libnsl.so.1 =>   /lib/libnsl.so.1
librt.so.1 =>  /lib/librt.so.1
libidn.so.11 =>  /opt/csw/lib/libidn.so.11
libc.so.1 => /lib/libc.so.1
libpthread.so.1 =>   /lib/libpthread.so.1
libp11-kit.so.0 =>   /opt/csw/lib/libp11-kit.so.0
libhogweed.so.2 =>   /opt/csw/lib/libhogweed.so.2
libgmp.so.10 =>  /opt/csw/lib/libgmp.so.10
libintl.so.8 =>  /opt/csw/lib/libintl.so.8
libgcc_s.so.1 => /opt/csw/lib/libgcc_s.so.1
libicuuc.so.54 => /opt/csw/lib/i386/libicuuc.so.54
libmp.so.2 => /lib/libmp.so.2
libmd.so.1 => /lib/libmd.so.1
libscf.so.1 =>   /lib/libscf.so.1
libaio.so.1 =>   /lib/libaio.so.1
libm.so.1 => /lib/libm.so.1
libdl.so.1 => /lib/libdl.so.1
libCrun.so.1 =>  /usr/lib/libCrun.so.1
libicudata.so.54 =>  /opt/csw/lib/libicudata.so.54
libm.so.2 => /lib/libm.so.2
libdoor.so.1 =>  /lib/libdoor.so.1
libuutil.so.1 => /lib/libuutil.so.1
libgen.so.1 =>   /lib/libgen.so.1

Tim


Am Dienstag, 10. Februar 2015, 11:29:43 schrieb dcla...@blastwave.org:
  On February 10, 2015 at 10:00 AM Tim Ruehsen invalid.nore...@gnu.org
  wrote:
  
  
  Follow-up Comment #2, bug #44090 (project wget):
  
  strcasecmp is not standard, right.
  

Re: [Bug-wget] [PATCH] Fix for #43785

2015-02-07 Thread Tim Rühsen
Am Samstag, 7. Februar 2015, 10:14:33 schrieb Giuseppe Scrivano:
 Tim Ruehsen tim.rueh...@gmx.de writes:
  Fix #43785 for another Solaris issue.
  
  I would like to see this in the next bugfix release.
  
  Do you agree ?
  
  Tim
  
  From 518ed8d07c0a6e441e0a919ba1967d89e0061898 Mon Sep 17 00:00:00 2001
  From: =?UTF-8?q?Tim=20R=C3=BChsen?= tim.rueh...@gmx.de
  Date: Thu, 5 Feb 2015 16:05:24 +0100
  Subject: [PATCH] src/wget.h: Fix libintl.h / gettext clash for Solaris
  
  In case --disable-nls is given, we have to include libintl.h before
  our gettext defines because libintl.h will be implicitly included by
  locale.h. And in this case our defines will conflict with libintl.h's
  function prototypes.
  ---
  
   src/wget.h | 11 +++
   1 file changed, 7 insertions(+), 4 deletions(-)
  
  diff --git a/src/wget.h b/src/wget.h
  index cddacdc..0b2381d 100644
  --- a/src/wget.h
  +++ b/src/wget.h
  @@ -59,11 +59,14 @@ as that of the covered work.  */
  
   /* `gettext (FOO)' is long to write, so we use `_(FOO)'.  If NLS is
   
  unavailable, _(STRING) simply returns STRING.  */
   
   #if ENABLE_NLS
  
  -#  include <libintl.h>
  -#  define _(STRING) gettext(STRING)
  +# include <libintl.h>
  +# define _(STRING) gettext(STRING)
  
   #else
  
  -#  define _(STRING) STRING
  -#  define ngettext(STRING1,STRING2,N) STRING2
  +# ifdef solaris
  +#  include <libintl.h>
  +# endif
 
 I would prefer that we don't have OS dependent ifdefs.  What about
 redefining ngettext as following when !ENABLE_NLS?  Would this be
 enough?
 
 # define ngettext(Msgid1, Msgid2, N) \
     ((N) == 1 \
      ? ((void) (Msgid2), (const char *) (Msgid1)) \
      : ((void) (Msgid1), (const char *) (Msgid2)))

This does not work (Kiyoshi wrote me a private mail about it).

 Or can't we just #include <libintl.h> on all platforms?

I wouldn't expect this to work everywhere.

Another idea: how often do we use ngettext ? AFAIR, only 1 or 2 times in the 
code. So why not remove the #define and amend those two occurrences ?
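
For illustration, a minimal sketch of what amending a call site could look
like (the report() wrapper, message text and variable are made up, not wget
code; it compiles with and without ENABLE_NLS):

  #include <stdio.h>

  #if ENABLE_NLS
  # include <libintl.h>
  #endif

  static void report (int n)
  {
  #if ENABLE_NLS
    printf (ngettext ("Downloaded %d file.\n", "Downloaded %d files.\n", n), n);
  #else
    /* without NLS, keep just the plural form right at the call site */
    printf ("Downloaded %d files.\n", n);
  #endif
  }

  int main (void)
  {
    report (3);
    return 0;
  }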

Tim




Re: [Bug-wget] Use of 'ssl_st'

2015-02-05 Thread Tim Rühsen
Am Donnerstag, 5. Februar 2015, 23:56:43 schrieb Darshit Shah:
 Hi Tim,
 
 I'm getting a 'page does not exist' error for the link you provided. Can you
 please check it?

I just checked with Iceweasel/Firefox, Konqueror and Chromium. 
It works in all cases.

Could you try it again ?

But I can copy the text (it is not much, and maybe not related to the issue 
Gisle found - I read the curl mailing list archive).

OpenSSL made the SSL_SESSION struct private and broke the API between 1.0.1 
and 1.0.2. Fix: use (another) undocumented function... :-/

Tim




Re: [Bug-wget] Avoid filename indexes

2015-01-31 Thread Tim Rühsen
Am Samstag, 31. Januar 2015, 17:58:54 schrieb Alexander Kurakin:
 Good day!
 
 Suppose I use --recursive mode.
 Suppose there are 'www.example.com/about' and 'www.example.com/about/more'
 pages.
 
 Then I have 'about.1.html' and 'about/more.html' files.
 Can I avoid the '.1' suffix by wget options?

Hi Alexander,

what would you expect ?

Basically, Wget saves the downloaded content to the filesystem.
When it has saved 'about/more', how can it save a new file 'about' - this name 
already exists as a directory !? The same goes when the download order is reversed.

(BTW, are you using -E/--adjust-extension (or the obsolete option 
--html-extension) ?)

Tim




Re: [Bug-wget] [PATCH] Force --show-progress to print progress bar to stderr

2015-01-20 Thread Tim Rühsen
Am Dienstag, 20. Januar 2015, 22:36:12 schrieb Darshit Shah:
 So, I tried pushing the changes, but I'm unable to do so. git push
 gives me the following error:
 
 Counting objects: 11, done.
 Delta compression using up to 4 threads.
 Compressing objects: 100% (11/11), done.
 Writing objects: 100% (11/11), 1.62 KiB | 0 bytes/s, done.
 Total 11 (delta 10), reused 0 (delta 0)
 error: unpack failed: unpack-objects abnormal exit
 To git://git.savannah.gnu.org/wget.git
 ! [remote rejected] master -> master (n/a (unpacker error))
 error: failed to push some refs to 'git://git.savannah.gnu.org/wget.git'
 
 
 I even tried a fresh clone of Wget, but I keep getting the same error
 which leads me to believe that the issue is in the savannah servers.
 I've opened a support ticket, but if someone is able to push to the
 repository, please push this commit.

I pushed it, no problems here.

Tim



Re: [Bug-wget] wget doesn't handle 5-digit port numbers in EPSV reponses

2015-01-04 Thread Tim Rühsen
Am Sonntag, 4. Januar 2015, 16:19:01 schrieb Adam Sampson:
 Dear wget authors,
 
 When using wget --passive-ftp against IPv6 FTP servers, I occasionally
 get the following error:
 
   ==> EPSV
   Cannot parse PASV response.
 
 I finally found an FTP server that consistently had this problem today
 (stunnel.mirt.net), and strace showed that the response in question was:
 
   229 Entering Extended Passive Mode (|||49854|).
 
 This is a perfectly valid response. wget is getting confused because of
 an off-by-one error in the code that parses the port number in ftp_epsv.
 When the port number is 5 digits long, i will be 5 at the end of the
 loop, so the test for an invalid port number length should check for it
 being *greater than* 5.
 
 Here's the trivial fix:
 
 * ftp-basic.c (ftp_epsv): Accept 5-digit port numbers in EPSV responses.
 
 diff -x config.log -x config.status -ru tmp/wget-1.16.1/src/ftp-basic.c work/wget-1.16.1/src/ftp-basic.c
 --- tmp/wget-1.16.1/src/ftp-basic.c   2014-12-02 07:49:37 +0000
 +++ work/wget-1.16.1/src/ftp-basic.c  2015-01-04 16:06:02 +0000
 @@ -788,7 +788,7 @@
    for (tport = 0, i = 0; i < 5 && c_isdigit (*s); i++, s++)
      tport = (*s - '0') + 10 * tport;
 
 -  if (i >= 5)
 +  if (i > 5)
     {
       xfree (respline);
       return FTPINVPASV;
 
 Thanks very much (and happy new year!),

Happy New Year, Adam.

The loop condition is i < 5, so when i becomes 5 the loop stops.
So how can i be > 5 here ?

Tim




Re: [Bug-wget] wget doesn't handle 5-digit port numbers in EPSV reponses

2015-01-04 Thread Tim Rühsen
Am Sonntag, 4. Januar 2015, 18:57:23 schrieb Tim Rühsen:
 Am Sonntag, 4. Januar 2015, 16:19:01 schrieb Adam Sampson:
  Dear wget authors,
  
  When using wget --passive-ftp against IPv6 FTP servers, I occasionally
  
  get the following error:
     ==> EPSV
Cannot parse PASV response.
  
  I finally found an FTP server that consistently had this problem today
  
  (stunnel.mirt.net), and strace showed that the response in question was:
229 Entering Extended Passive Mode (|||49854|).
  
  This is a perfectly valid response. wget is getting confused because of
  an off-by-one error in the code that parses the port number in ftp_epsv.
  When the port number is 5 digits long, i will be 5 at the end of the
  loop, so the test for an invalid port number length should check for it
  being *greater than* 5.
  
  Here's the trivial fix:
  
  * ftp-basic.c (ftp_epsv): Accept 5-digit port numbers in EPSV responses.
  
   diff -x config.log -x config.status -ru tmp/wget-1.16.1/src/ftp-basic.c work/wget-1.16.1/src/ftp-basic.c
   --- tmp/wget-1.16.1/src/ftp-basic.c   2014-12-02 07:49:37 +0000
   +++ work/wget-1.16.1/src/ftp-basic.c  2015-01-04 16:06:02 +0000
   @@ -788,7 +788,7 @@
      for (tport = 0, i = 0; i < 5 && c_isdigit (*s); i++, s++)
        tport = (*s - '0') + 10 * tport;
   
   -  if (i >= 5)
   +  if (i > 5)
       {
         xfree (respline);
         return FTPINVPASV;
  
  Thanks very much (and happy new year!),
 
 Happy New Year, Adam.
 
 The loop condition is i < 5, so when i becomes 5 the loop stops.
 So how can i be > 5 here ?

Hehe, I was a bit too fast ;-)

i == 5 *is* a valid value after the loop, so you are right.

But since i never becomes > 5, the check does not make sense and we should 
remove it. Or change it to what it was meant to be (e.g. i == 5 && 
c_isdigit (*s)), I guess.
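
To make that concrete, a small self-contained sketch of the parsing logic
(simplified from ftp_epsv; plain isdigit stands in for gnulib's c_isdigit):

  #include <stdio.h>
  #include <ctype.h>

  int main (void)
  {
    const char *s = "49854|";   /* the digits from "(|||49854|)" */
    int i, tport;

    /* at most 5 digits are consumed, so i ends up in 0..5 */
    for (tport = 0, i = 0; i < 5 && isdigit ((unsigned char) *s); i++, s++)
      tport = (*s - '0') + 10 * tport;

    /* the only overflow case: 5 digits consumed and a 6th one pending */
    if (i == 5 && isdigit ((unsigned char) *s))
      puts ("invalid port (more than 5 digits)");   /* i.e. FTPINVPASV */
    else
      printf ("port = %d\n", tport);                /* prints 49854 */
    return 0;
  }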

Tim




Re: [Bug-wget] wget doesn't handle 5-digit port numbers in EPSV reponses

2015-01-04 Thread Tim Rühsen
Am Sonntag, 4. Januar 2015, 18:39:05 schrieb Adam Sampson:
 On Sun, Jan 04, 2015 at 07:06:05PM +0100, Tim Rühsen wrote:
  But since i never becomes > 5, the check does not make sense and we
  should remove it. Or change it to what it was meant to be
  (e.g. i == 5 && c_isdigit (*s)), I guess.
 
 There's a check for whether the next character is the delimiter
 immediately afterwards, so just removing the check entirely should be
 fine.

Thanks, I pushed the appropriate patch.

Tim




Re: [Bug-wget] wget 1.16.1 on Cygwin mangles %2B in redirects

2015-01-03 Thread Tim Rühsen
A short test here on Linux can reproduce it:

LC_ALL=C ../src/wget -d --restrict-file-names=windows \
  "http://www.example.com/bla?Signature=xxx%2Bxxx"
...
URI encoding = 'ANSI_X3.4-1968'
converted 'http://www.example.com/bla?Signature=xxx%2Bxxx' (ANSI_X3.4-1968) -> 
'http://www.example.com/bla?Signature=xxx+xxx' (UTF-8)
...
GET /bla?Signature=xxx+xxx HTTP/1.1

But after a 4xx you can also see:
[IRI fallbacking to non-utf8 for 
'http://www.example.com/bla?Signature=xxx%2Bxxx'
...
GET /bla?Signature=xxx%2Bxxx HTTP/1.1
...

With this second try (also seen in your example output) the file is correctly 
downloaded and saved. This looks like expected (or intended) behavior of Wget, but 
I can't say any more about it.
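
To illustrate why the first try breaks the signed URL, here is a toy
percent-decoder (not Wget's IRI code, just a minimal sketch assuming
well-formed input):

  #include <stdio.h>

  int main (void)
  {
    const char *query = "Signature=xxx%2Bxxx";
    char out[64];
    int i = 0, o = 0;

    while (query[i])
      {
        if (query[i] == '%')
          {
            unsigned int c;
            sscanf (query + i + 1, "%2x", &c);   /* "%2B" -> 0x2B, i.e. '+' */
            out[o++] = (char) c;
            i += 3;
          }
        else
          out[o++] = query[i++];
      }
    out[o] = '\0';
    printf ("%s\n", out);   /* prints "Signature=xxx+xxx" */
    return 0;
  }

Once decoded, nothing can tell that this '+' was originally percent-encoded,
so the re-sent query no longer matches the signature and S3 answers 403.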

Tim


Am Samstag, 3. Januar 2015, 02:28:22 schrieb Benjamin Gilbert:
 Hi,
 
 On Cygwin, wget 1.16.1 (and current Git master) occasionally fails to fetch
 release artifacts hosted on GitHub.  Example URL:
 
 https://github.com/openslide/openslide/releases/download/v3.4.0/openslide-3.
 4.0.tar.xz
 
 GitHub redirects the request to an Amazon S3 signed URL, and that signature
 sometimes includes a percent-encoded + character.  In this case, wget
 improperly decodes the %2B back to + before processing the redirect,
 causing Amazon to return 403.  Wget then repeatedly retries the request to
 GitHub until either 1) the redirect URL does not include a +, or 2) 20
 failed redirects have occurred.
 
 This does not happen if IRI is compiled out, does not happen if wget is run
 from the Cygwin shell, and does not happen on Linux.  It does occur if wget
 is run from the Windows command shell (or, as I originally discovered it,
 from a Buildbot process running under cygrunsrv).
 
 
 wget version info:
 
 C:\cygwin\home\user\wgetsrc\wget.exe -V
 GNU Wget 1.16.1.36-01d5 built on cygwin.
 
 +digest +https +ipv6 +iri +large-file +nls +ntlm +opie -psl +ssl/gnutls
 
 Wgetrc:
 /usr/local/etc/wgetrc (system)
 Locale:
 /usr/local/share/locale
 Compile:
 gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC=/usr/local/etc/wgetrc
 -DLOCALEDIR=/usr/local/share/locale -I. -I../lib -I../lib
 -Iyes/include -I/usr/include/p11-kit-1 -DHAVE_LIBGNUTLS -DNDEBUG
 Link:
 gcc -I/usr/include/p11-kit-1 -DHAVE_LIBGNUTLS -DNDEBUG -Lyes/lib
 -liconv -lintl -lpcre -luuid -lnettle -lgnutls -lz -lintl -liconv
 -lp11-kit -lgmp -lhogweed -lgmp -lnettle -ltasn1 -lp11-kit -lz -lz
 -lidn ftp-opie.o gnutls.o http-ntlm.o ../lib/libgnu.a
 
 Copyright (C) 2014 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later
 http://www.gnu.org/licenses/gpl.html.
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.
 
 Originally written by Hrvoje Niksic hnik...@xemacs.org.
 Please send bug reports and questions to bug-wget@gnu.org.
 
 
 
 Sample failure:
 
 C:\cygwin\home\user\wgetsrc\wget.exe
 https://github.com/openslide/openslide/releases/download/v3.4.0/openslide-3.
 4.0.tar.xz --2015-01-03 06:41:09--
 https://github.com/openslide/openslide/releases/download/v3.4.0/openslide-3.
 4.0.tar.xz Connecting to github.com (github.com)|192.30.252.131|:443...
 connected. HTTP request sent, awaiting response... 302 Found
 Location:
 https://s3.amazonaws.com/github-cloud/releases/827644/96a6c9f0-8629-11e3-9b05-739b7f3aed83.xz?response-content-disposition=attachment%3B%20filename%3Dopenslide-3.4.0.tar.xz&response-content-type=application/octet-stream&AWSAccessKeyId=AKIAISTNZFOVBIJMK3TQ&Expires=1420267329&Signature=5d6RHL2oIvoDkET%2BZSiGgT4ZCY0%3D [following]
 --2015-01-03 06:41:09--
 https://s3.amazonaws.com/github-cloud/releases/827644/96a6c9f0-8629-11e3-9b05-739b7f3aed83.xz?response-content-disposition=attachment;%20filename=openslide-3.4.0.tar.xz&response-content-type=application/octet-stream&AWSAccessKeyId=AKIAISTNZFOVBIJMK3TQ&Expires=1420267329&Signature=5d6RHL2oIvoDkET+ZSiGgT4ZCY0=
 Connecting to s3.amazonaws.com (s3.amazonaws.com)|54.231.2.56|:443... connected.
 
 HTTP request sent, awaiting response... 403 Forbidden
 2015-01-03 06:41:09 ERROR 403: Forbidden.
 
 --2015-01-03 06:41:09--
 https://github.com/openslide/openslide/releases/download/v3.4.0/openslide-3.
 4.0.tar.xz Connecting to github.com (github.com)|192.30.252.131|:443...
 connected. HTTP request sent, awaiting response... 302 Found
 Location:
 https://s3.amazonaws.com/github-cloud/releases/827644/96a6c9f0-8629-11e3-9b05-739b7f3aed83.xz?response-content-disposition=attachment%3B%20filename%3Dopenslide-3.4.0.tar.xz&response-content-type=application/octet-stream&AWSAccessKeyId=AKIAISTNZFOVBIJMK3TQ&Expires=1420267330&Signature=MK1DBqMuFnPJ2BF0dOTSnPxDkw0%3D [following]
 --2015-01-03 06:41:10--
 https://s3.amazonaws.com/github-cloud/releases/827644/96a6c9f0-8629-11e3-9b05-739b7f3aed83.xz?response-content-disposition=attachment;%20filename=openslide-3.4.0.tar.xz&response-content-type=application/octet-stream&AWSAccessKeyId=AKIAISTNZFOVBIJMK3TQ&Expires=1420267330&Signature=MK1DBqMuFnPJ2BF0dOTSnPx

Re: [Bug-wget] IDN and IRI tests fail on MS-Windows with wget 1.16.1

2014-12-27 Thread Tim Rühsen
Am Samstag, 27. Dezember 2014, 10:39:25 schrieb Eli Zaretskii:
  From: Tim Rühsen tim.rueh...@gmx.de
  Date: Thu, 25 Dec 2014 15:43:27 +0100
  
FAIL: Test-idn-headers.px
FAIL: Test-idn-meta.px
  
  These use EUC_JP encoded file name, but do not state
  --local-encoding on the wget command line, so the non-ASCII
  characters get mangled by Windows (because Windows tries to convert
  non-Unicode non-ASCII strings to the current system codepage).
  Test-idn-* tests that do state --local-encoding do succeed.  Is it
  possible that the tests assume something about the local encoding,
  like that it's UTF-8?
  
  Let's start with 'Test-idn-meta'.
  No non-ASCII filename will be written to disk, the Content-type is stated
  correctly. --local-encoding sets the encoding used when reading a local file
  or the command line. So it shouldn't influence this test. And I can't
  reproduce the stated behavior.
  
  Please send me the --debug output of this test with and without --local-
  encoding given.
 
 The output is attached.  I collected that by redirecting the test
 script's stderr to a file, I hope that's what you meant.
 
 I noticed that the output says:
 
  converted 'http://bunch of octal escapes/' (CP1255) -> 'http://another
 bunch of octal escapes/' (UTF-8)
 
 So I tried to use --local-encoding=EUC-JP, and that made the test
 succeed.  The third attachment below is from that successful run.

Thanks, Eli.

Your tests helped me to reproduce the problem:
- install (and set) a non-UTF-8 and non-C/POSIX locale
- use this locale for testing, e.g.:
  TESTS_ENVIRONMENT=LC_ALL=de_DE.iso885915@euro make check TESTS=Test-idn-meta

And what I see in the logs: Wget has a severe problem.
When loading a saved (HTML) document, Wget parses it with the local encoding 
instead of the encoding stated by the server (or the document itself). Of course 
this can't work, and this is the reason why your 3rd test works (setting the 
local encoding to the real encoding of the document).

After the 400 server response, Wget loads the document again, now with the 
correct encoding. But Wget 'remembers' some incorrect conversions from the 
first try and thus fails again.


I would expect Wget to load the document with the correct encoding in the first 
place... but it looks like this 'double loading' has been done on purpose.

Can anyone bring some light here before I fix Wget's behavior, please !

Tim




Re: [Bug-wget] IDN and IRI tests fail on MS-Windows with wget 1.16.1

2014-12-27 Thread Tim Rühsen
Am Samstag, 27. Dezember 2014, 13:57:21 schrieb Tim Rühsen:
 Am Samstag, 27. Dezember 2014, 10:39:25 schrieb Eli Zaretskii:
   From: Tim Rühsen tim.rueh...@gmx.de
   Date: Thu, 25 Dec 2014 15:43:27 +0100
   
 FAIL: Test-idn-headers.px
 FAIL: Test-idn-meta.px
   
   These use EUC_JP encoded file name, but do not state
   --local-encoding on the wget command line, so the non-ASCII
   characters get mangled by Windows (because Windows tries to convert
   non-Unicode non-ASCII strings to the current system codepage).
   Test-idn-* tests that do state --local-encoding do succeed.  Is it
   possible that the tests assume something about the local encoding,
   like that it's UTF-8?
   
   Let's start with 'Test-idn-meta'.
   No non-ASCII filename will be written to disk, the Content-type is
   stated
   correctly. --local-encoding sets the encoding used when reading a local file
   or the command line. So it shouldn't influence this test. And I can't
   reproduce the stated behavior.
   
   Please send me the --debug output of this test with and without --local-
   encoding given.
  
  The output is attached.  I collected that by redirecting the test
  script's stderr to a file, I hope that's what you meant.
  
  I noticed that the output says:
converted 'http://bunch of octal escapes/' (CP1255) -> 'http://another
  bunch of octal escapes/' (UTF-8)
  
  So I tried to use --local-encoding=EUC-JP, and that made the test
  succeed.  The third attachment below is from that successful run.
 
 Thanks, Eli.
 
 Your tests helped me to reproduce the problem:
 - install (and set) a non-UTF-8 and non-C/POSIX locale
 - use this locale for testing, e.g.:
    TESTS_ENVIRONMENT=LC_ALL=de_DE.iso885915@euro make check TESTS=Test-idn-meta
 
  And what I see in the logs: Wget has a severe problem.
  When loading a saved (HTML) document, Wget parses it with the local encoding
  instead of the encoding stated by the server (or the document itself). Of course
  this can't work, and this is the reason why your 3rd test works (setting the
  local encoding to the real encoding of the document).
 
 After the 400 server response, Wget loads the document again, now with the
 correct encoding. But Wget 'remembers' some incorrect conversions from the
 first try and thus fails again.
 
 
  I would expect Wget to load the document with the correct encoding in the
  first place... but it looks like this 'double loading' has been done on
  purpose.

After having a deeper look into the IRI/IDN design of Wget I have to correct 
myself. IMHO, Wget's IRI support is deeply broken. I guess it needs a 
redesign to fix it. And that exceeds the amount of time that I have.

Tim




Re: [Bug-wget] Need wget feature defending against evil ISP's HPPT 302 HIJACK

2014-12-25 Thread Tim Rühsen
Am Mittwoch, 24. Dezember 2014, 08:48:46 schrieb Dawei Tong:
 Hello wget developers: I live in China and have a China TieTong
 Telecommunications DSL connection. This ISP's servers continuously send
 HTTP 302 redirects with junk/ad links that corrupt my downloaded files. I
 found this by analyzing the corrupted files: I compared 2 corrupted files
 from the same source and found junk data inserted into normal files.
 The test file is a World of Tanks game installer; I downloaded it twice,
 and both copies are corrupted. Here is my test result:
 cmp -b -l b1_WoT.0.9.4_cn_setup.944980-2.bin b2_WoT.0.9.4_cn_setup.944980-2.bin
 456582373 261 M-1  110 H

...

 
 Need a feature to keep downloaded files intact.


If manipulation via redirection is your only concern:

1. Try to use the IP address of the download server directly instead of the 
domain name.
2. Try to download via HTTPS with the --https-only option. At least it would 
be much more work for your ISP to properly manipulate the HTTPS protocol.

Also, for many downloads you'll find checksums on different sites. Make sure 
they are all the same and compare them with the checksum of your download.

In any case, have a look at Wget's output to detect redirections. But be aware 
of the fact that it is very easy to intercept HTTP connections and manipulate 
downloads on the fly (without redirection). Comparing (trusted) checksums is 
the only safe way to detect manipulation in this case.

Good luck !

And if everything fails, ask a friend with a different ISP to download the file 
for you ;-)

Tim




Re: [Bug-wget] Wget 1.16.1 detection of non-system openssl broken on MacOSX.

2014-12-25 Thread Tim Rühsen
Am Donnerstag, 18. Dezember 2014, 16:06:01 schrieb Jochen Roderburg:
 Hi Tim,

 Am 16.12.2014 um 14:00 schrieb Tim Ruehsen:
  That is not so easy, since when having
  --with-libssl-prefix=/usr/local/ssl,
  you can't just
  OPENSSL_CFLAGS=$with_libssl_prefix/include
  OPENSSL_LIBS=$with_libssl_prefix/lib
  You also have to manually specify the libraries you want to link with...
  this can be different on different systems.

 You are right, and this does also not ensure that the wanted libraries
 are actually used at run-time, on Linux e.g. the OPENSSL_LIBS must also
 somehow be in the dynamic library load path.

  So I decided to set PKG_CONFIG_PATH before the check and unset it
  afterwards. It works for me... but I failed to test it without
  pkg-config. Without pkg-config I couldn't get autoreconf working :-(

 Unfortunately this approach has the same problems, as it also does not do
 more than set the include and lib paths.


 I made the following experiments on my Linux:

 Installed a current OPENSSL from source in /usr/local/openssl (with
 openssl configure:  ./config --prefix=/usr/local/openssl shared)

 Configured and built wget 1.16.1 with your PKG_CONFIG_PATH patch (with
 wget configure: configure --with-ssl=openssl
 --with-libssl-prefix=/usr/local/openssl).

 With an active pkg-config program I see now the following in the config
 summary:
 CFlags:  -I/usr/include/uuid   -I/usr/local/openssl/include
 -DHAVE_LIBSSL   -DNDEBUG
 Libs:-lpcre   -luuid   -L/usr/local/openssl/lib -lssl -lcrypto -lz
-lidn

 So the compile and link phase will use the wanted files, but a ldd on
 the resulting wget binary still shows:

 libssl.so.1.0.0 => /lib/libssl.so.1.0.0
 libcrypto.so.1.0.0 => /lib/libcrypto.so.1.0.0

 With a deactivated pkg-config program the old library detection code
 seems to be used and I get:

 CFlags:   -DNDEBUG -I/usr/local/openssl/include
 Libs: /usr/local/openssl/lib/libssl.so
 /usr/local/openssl/lib/libcrypto.so
-Wl,-rpath -Wl,/usr/local/openssl/lib/ -ldl -lz  -lidn -luuid
 -lpcre

 and ldd shows now:

 libssl.so.1.0.0 => /usr/local/openssl/lib/libssl.so.1.0.0
 libcrypto.so.1.0.0 => /usr/local/openssl/lib/libcrypto.so.1.0.0

 The wanted openssl libraries are inserted here with full path names and
 really used.


 Therefore I see now only one possibility to keep these two prefix
 options without more problems and new questions, namely to skip the
 pkg-config based detection completely for the SSL libraries if these
 options are used.

I attached a patch to skip pkg-config detection when --with...-prefix is given.
It works for me, but please also review and test.

Tim
From 6f62bc5cd86aa7383cded38f61871afd4964c577 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Thu, 25 Dec 2014 15:21:44 +0100
Subject: [PATCH] configure.ac: Skip pkg-config for openssl and gnutls when
 prefix is given

Make --with-libssl-prefix and --with-libgnutls-prefix do the right thing,
no matter if pkg-config is installed or not.

Reported-by: Charles Diza chd...@gmail.com
---
 configure.ac | 44 ++--
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/configure.ac b/configure.ac
index 44f4d6d..01ea237 100644
--- a/configure.ac
+++ b/configure.ac
@@ -334,15 +334,19 @@ AS_IF([test x$with_zlib != xno], [
 ])

 AS_IF([test x$with_ssl = xopenssl], [
-  PKG_CHECK_MODULES([OPENSSL], [openssl], [
-AC_MSG_NOTICE([compiling in support for SSL via OpenSSL])
-AC_LIBOBJ([openssl])
-LIBS=$OPENSSL_LIBS $LIBS
-CFLAGS=$OPENSSL_CFLAGS -DHAVE_LIBSSL $CFLAGS
-LIBSSL=  # ntlm check below wants this
-AC_CHECK_FUNCS([RAND_egd])
-AC_DEFINE([HAVE_LIBSSL], [1], [Define if using openssl.])
-  ], [
+  if [test x$with_libssl_prefix = x]; then
+PKG_CHECK_MODULES([OPENSSL], [openssl], [
+  AC_MSG_NOTICE([compiling in support for SSL via OpenSSL])
+  AC_LIBOBJ([openssl])
+  LIBS=$OPENSSL_LIBS $LIBS
+  CFLAGS=$OPENSSL_CFLAGS -DHAVE_LIBSSL $CFLAGS
+  LIBSSL=  # ntlm check below wants this
+  AC_CHECK_FUNCS([RAND_egd])
+  AC_DEFINE([HAVE_LIBSSL], [1], [Define if using openssl.])
+  ssl_found=yes
+])
+  fi
+  if [test x$ssl_found != xyes]; then
 dnl As of this writing (OpenSSL 0.9.6), the libcrypto shared library
 dnl doesn't record its dependency on libdl, so we need to make sure
 dnl -ldl ends up in LIBS on systems that have it.  Most OSes use
@@ -399,7 +403,7 @@ AS_IF([test x$with_ssl = xopenssl], [
 AC_MSG_ERROR([--with-ssl=openssl was given, but SSL is not available.])
   fi
 ])
-  ])
+  fi
 ], [
   # --with-ssl is not openssl: check if it's no
   AS_IF([test x$with_ssl != xno], [
@@ -407,13 +411,17 @@ AS_IF([test x$with_ssl = xopenssl], [
 with_ssl=gnutls

 dnl Now actually check for -lgnutls
-PKG_CHECK_MODULES([GNUTLS], [gnutls], [
-  AC_MSG_NOTICE([compiling in support for SSL via GnuTLS])
-  AC_LIBOBJ([gnutls])
-  

Re: [Bug-wget] IDN and IRI tests fail on MS-Windows with wget 1.16.1

2014-12-25 Thread Tim Rühsen
Am Samstag, 20. Dezember 2014, 10:28:53 schrieb Eli Zaretskii:
 I've looked into the failing tests.  Here's the list of failed tests
 and my conclusions from looking at the logs and the test scripts:
 
  FAIL: Test-idn-headers.px
  FAIL: Test-idn-meta.px
 
These use EUC_JP encoded file name, but do not state
--local-encoding on the wget command line, so the non-ASCII
characters get mangled by Windows (because Windows tries to convert
non-Unicode non-ASCII strings to the current system codepage).
Test-idn-* tests that do state --local-encoding do succeed.  Is it
possible that the tests assume something about the local encoding,
like that it's UTF-8?

Let's start with 'Test-idn-meta'.
No non-ASCII filename will be written to disk, and the Content-Type is stated 
correctly. --local-encoding sets the encoding used when reading a local file or 
the command line, so it shouldn't influence this test. And I can't reproduce 
the stated behavior.

Please send me the --debug output of this test with and without --local-
encoding given.

Tim




Re: [Bug-wget] FTP tests fail on MS-Windows

2014-12-21 Thread Tim Rühsen
Am Sonntag, 21. Dezember 2014, 05:43:09 schrieb Eli Zaretskii:
  From: Tim Rühsen tim.rueh...@gmx.de
  Date: Sat, 20 Dec 2014 22:16:16 +0100
  
# FTP Server has to start with english locale due to use of
strftime

month names in LIST command

setlocale(LC_ALL, 'C');
$self->_launch_server(
   
   Thanks.  But how can this explain the 'index.html' thingy appearing in
   the FTP listing, instead of the expected afile.txt etc.?
  
  The .listing file can't be parsed correctly when the month names are
  incorrect.
 I see.  But then this is not the problem in my case.  Here's the
 listing I see in one of the FTP test logs:
 
   226 Listing complete. Data connection has been closed.
   -r--r--r-- 1  0  0  12 Dec  12:43 franτais.txt
   2014-12-19 12:43:03 (362 KB/s) - '.listing' saved [48]
 
 As you see, even though the file name includes non-ASCII characters,
 the month name is in English (which is what I'd expect, given the
 locale I have here).
 
  But I know, there are a few Windows users / developers reading this.
  Maybe they can help or bring some light !?
 
 I certainly hope so.
 
 Could the problem be that the listing has a CR-LF end-of-line format?
 Could that interfere with its parsing?
 
 Thanks.

Back to  Test-ftp-bad-list.px... if you add the following line to 
FTPServer.pm, you can see the difference between the GNU/Linux and the Windows 
.listing. The listing has two lines, so a difference of 4 bytes (reported in 
your original post) seems not to be a CRLF problem.
Also, as you can see below, even on GNU/Linux CRLF is used and not LF alone.


diff --git a/tests/FTPServer.pm b/tests/FTPServer.pm
index 3d7d8a5..261e819 100644
--- a/tests/FTPServer.pm
+++ b/tests/FTPServer.pm
@@ -134,6 +134,7 @@ sub _LIST_command
 {
 for my $item (@$listing)
 {
+print STDERR "$item\r\n";
 print $sock "$item\r\n";
 }
 }

Tim




Re: [Bug-wget] FTP tests fail on MS-Windows

2014-12-21 Thread Tim Rühsen
Am Sonntag, 21. Dezember 2014, 18:30:28 schrieb Eli Zaretskii:
  From: Tim Rühsen tim.rueh...@gmx.de
  Date: Sun, 21 Dec 2014 13:13:50 +0100
  
The .listing file can't be parsed correctly when the month names are
incorrect.
   
   I see.  But then this is not the problem in my case.  Here's the
   
   listing I see in one of the FTP test logs:
 226 Listing complete. Data connection has been closed.
 -r--r--r-- 1  0  0  12 Dec  12:43 franτais.txt
 2014-12-19 12:43:03 (362 KB/s) - '.listing' saved [48]
   
   As you see, even though the file name includes non-ASCII characters,
   the month name is in English (which is what I'd expect, given the
   locale I have here).
   
But I know, there are a few Windows users / developers reading this.
Maybe they can help or bring some light !?
   
   I certainly hope so.
   
   Could the problem be that the listing has a CR-LF end-of-line format?
   Could that interfere with its parsing?
   
   Thanks.
  
  Back to  Test-ftp-bad-list.px... if you add the following line to
  FTPServer.pm, you can see the difference between the GNU/Linux and the
  Windows .listing. The listing has two lines, so a difference of 4 bytes
  (reported in your original post) seems not to be a CRLF problem.
 
 Right you are.  I looked at it from the other end: enabled debugging
 output from wget, and then looked carefully at the output.  And I saw
 this:
 
   .listing       [ <=> ]  88  --.-KB/s   in 0s
 
   Closed fd 4
   226 Listing complete. Data connection has been closed.
   2014-12-21 18:16:05 (693 KB/s) - '.listing' saved [88]
 
   PLAINFILE; perms 444; size: 0; month: Dec; day: 18; time: 00:00:00 (no
 yr); Skipping. 
 ^^ 
   PLAINFILE; perms 444; size: 0; month: Dec; day: 18; time: 00:00:00 (no
 yr); Skipping.
   Removed '.listing'.
 
 Which then prompted me to take a closer look at the listing wget
 receives as result of LIST:
 
   -r--r--r-- 1  0  0  12 Dec  18:17 afile.txt
  ^^^
 Look, ma: no day of the month!

Outch !

 
 And that leads me to this line in FTPServer.pm:
 
 my $date = strftime("%b %e %H:%M", localtime);
 
 Which is the root cause of the problem: it uses %e, which is a C99
 feature, and is not supported on MS-Windows.  I replaced it with %d,
 and now all the FTP tests succeed!

Thanks for testing, I'll set up a patch for using %d.

%e comes from the Single Unix Specification (and C99 adopted it too),
but Microsoft's runtime doesn't support it.
FYI, see http://msdn.microsoft.com/en-us/library/fe06s4ak.aspx

I normally never use strftime in my applications; now I remember one of the 
reasons ;-).
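
For the record, a minimal C illustration (the Perl strftime in FTPServer.pm
ends up in the same C runtime call):

  #include <stdio.h>
  #include <time.h>

  int main (void)
  {
    char buf[32];
    time_t now = time (NULL);

    /* %e (space-padded day) comes from SUS/C99 and is missing from the
       Windows CRT of that era; %d (zero-padded) is plain C89 and works
       everywhere, and wget's listing parser accepts it just as well.  */
    strftime (buf, sizeof buf, "%b %d %H:%M", localtime (&now));
    puts (buf);   /* e.g. "Dec 21 18:17" */
    return 0;
  }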


 P.S. Now I'm beginning to wonder whether anyone runs the test suite on
 Windows, or cares about the results...

Looks like nobody does :-(

Tim




Re: [Bug-wget] Building and testing wget 1.16.1 on MinGW

2014-12-20 Thread Tim Rühsen
Am Samstag, 20. Dezember 2014, 10:37:30 schrieb Eli Zaretskii:
  From: Tim Ruehsen tim.rueh...@gmx.de
  Date: Fri, 19 Dec 2014 12:53:13 +0100
  
   4. make check fails to link test programs, such as unit-tests.exe,
   
  because it doesn't link against libintl.  Again, not sure how best
  to fix that (wget itself does get linked against libintl and
  libiconv).
  
  Dunno.
 
 Btw, I encountered a similar problem building wget on GNU/Linux,
 except that in that case what was missing was -lrt, without which
 clock_gettime and friends cannot be resolved.  So there seems to be
 some general problem here.

Ah, ok. The clock_gettime/-lrt problem has been solved by 4.11.2014.
What we do (basically, through m4/wget.m4) is
  AC_CHECK_FUNCS(clock_gettime, [], [
AC_CHECK_LIB(rt, clock_gettime)
  ])

Then we use LIB_CLOCK_GETTIME in tests/Makefile.am:
LDADD = ../src/libunittest.a ../lib/libgnu.a $(LIBS) $(LIB_CLOCK_GETTIME)

Maybe you could add a similar piece of code for libintl.
(Since I can't reproduce and can't test, I need your help here.)
A (for you) working patch would be great !


   Btw, to debug the Test-N issue, I had to add the time stamps being
   compared to the Perl script that runs the test.  It would be nice to
   have that in the tests by default, because just by looking at the time
   stamps, one can immediately understand in what direction to look for
   the problem (in my case I saw a difference of 3600 sec).
  
  Since I am not 100% sure what you are talking about and you already
  extended the (or a) Test-N, please send us a diff/patch.
 
 Here it is:
 
 --- tests/WgetTests.pm~   2014-11-01 23:07:48.000000000 +0200
 +++ tests/WgetTests.pm    2014-12-20 10:34:05.920875000 +0200
 @@ -319,7 +319,7 @@ sub _verify_download
          = stat $filename;
 
      $mtime == $filedata->{'timestamp'}
 -       or return "Test failed: wrong timestamp for file $filename\n";
 +       or return "Test failed: wrong timestamp for file $filename: expected = $filedata->{'timestamp'}, actual = $mtime\n";
      }
 
  }

Thanks, Eli.
I pushed it.

Tim




Re: [Bug-wget] FTP tests fail on MS-Windows

2014-12-20 Thread Tim Rühsen
Am Freitag, 19. Dezember 2014, 18:18:31 schrieb Eli Zaretskii:
 I've built wget 1.16.1 on GNU/Linux as well, and compared the test
 results to try and figure out why some tests fail on Windows.
 
 When I run Test-ftp-bad-list.px on GNU/Linux, I see this output:
 
  Running test Test-ftp-bad-list
  Calling /srv/data/home/e/eliz/wget-1.16.1/tests/../src/wget -nH -Nc -r
 ftp://localhost:40221/ --2014-12-19 11:13:27--  ftp://localhost:40221/
   => ‘.listing’
  Resolving localhost (localhost)... ::1, 127.0.0.1
  Connecting to localhost (localhost)|::1|:40221... failed: Connection
 refused. Connecting to localhost (localhost)|127.0.0.1|:40221... connected.
 Logging in as anonymous ... Logged in!
 ==> SYST ... done.    ==> PWD ... done.
 ==> TYPE I ... done.  ==> CWD not needed.
 ==> PASV ... done.    ==> LIST ... _LIST_command - dir is: /
  done.
 
 .listing       [ <=> ]  92  --.-KB/s    in 0s
 
  2014-12-19 11:13:27 (3.65 MB/s) - ‘.listing’ saved [92]
 
  Removed ‘.listing’.
  The sizes do not match (local 12) -- retrieving.
 
  --2014-12-19 11:13:27--  ftp://localhost:40221/afile.txt
  => ‘afile.txt’
 ==> CWD not required.
 ==> SIZE afile.txt ... 12
  File has already been retrieved.
  2014-12-19 11:13:27 (0.00 B/s) - ‘afile.txt’ saved [12]
 
 But on Windows the same test yields this:
 
  Running test Test-ftp-bad-list
  Calling /d/gnu/wget-1.16.1/tests/../src/wget -nH -Nc -r
 ftp://localhost:3244/ --2014-12-19 18:14:54--  ftp://localhost:3244/
  => '.listing'
  Resolving localhost (localhost)... 127.0.0.1
  Connecting to localhost (localhost)|127.0.0.1|:3244... connected.
  Logging in as anonymous ... Logged in!
 ==> SYST ... done.    ==> PWD ... done.
 ==> TYPE I ... done.  ==> CWD not needed.
 ==> PASV ... done.    ==> LIST ... _LIST_command - dir is: /
  done.
 
 .listing       [ <=> ]  88  --.-KB/s    in 0s
 
  2014-12-19 18:14:54 (818 KB/s) - '.listing' saved [88]
 
  Removed '.listing'.
  --2014-12-19 18:14:54--  ftp://localhost:3244/
  => 'index.html'
 ==> CWD not required.
 ==> SIZE  ... done.

 ==> PASV ... done.    ==> RETR  ...
  No such file ''.
 
 Note the 'index.html' thing -- that's where the difference between the
 two systems begins.  Looks like the server behaves differently,
 doesn't it?
 
 Does anyone have a clue what is going on here?  Any bells ring for
 anyone?

I fixed a locale thing a while ago. It was FTPServer.pm using strftime() 
generating non-english month names. Wget expects english month names (of 
course). What fixed that for me was a change in WgetTests.pm:

...
# FTP Server has to start with english locale due to use of strftime 
month names in LIST command
setlocale(LC_ALL, 'C');
$self->_launch_server(
...

Maybe this is not enough in your environment... That also might be the reason 
for other failures. Do you have a recent version of the auto* tools installed ?

Maybe you can tell me your locale settings and/or try using the same locale 
settings on GNU/Linux.

'make check' explicitly sets LC_ALL=C.
So i could produce these errors only with
TESTS_ENVIRONMENT=LC_ALL=de_DE.utf8 make check

Since these things are 'fixed for me', I can't reproduce them any more and 
thus can't be of much further help.

Tim




Re: [Bug-wget] FTP tests fail on MS-Windows

2014-12-20 Thread Tim Rühsen
Am Samstag, 20. Dezember 2014, 22:06:01 schrieb Eli Zaretskii:
  From: Tim Rühsen tim.rueh...@gmx.de
  Date: Sat, 20 Dec 2014 18:18:20 +0100
  
   But on Windows the same test yields this:
Running test Test-ftp-bad-list
Calling /d/gnu/wget-1.16.1/tests/../src/wget -nH -Nc -r
   
   ftp://localhost:3244/ --2014-12-19 18:14:54--  ftp://localhost:3244/
   
  => '.listing'
 
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:3244... connected.
Logging in as anonymous ... Logged in!
 ==> SYST ... done.    ==> PWD ... done.
 ==> TYPE I ... done.  ==> CWD not needed.
 ==> PASV ... done.    ==> LIST ... _LIST_command - dir is: /
done.

 .listing       [ <=> ]  88  --.-KB/s    in 0s
   
2014-12-19 18:14:54 (818 KB/s) - '.listing' saved [88]

Removed '.listing'.
--2014-12-19 18:14:54--  ftp://localhost:3244/
 
  => 'index.html'
 
 ==> CWD not required.
 ==> SIZE  ... done.

 ==> PASV ... done.    ==> RETR  ...
No such file ''.
   
   Note the 'index.html' thing -- that's where the difference between the
   two systems begins.  Looks like the server behaves differently,
   doesn't it?
   
   Does anyone have a clue what is going on here?  Any bells ring for
   anyone?
  
  I fixed a locale thing a while ago. It was FTPServer.pm using strftime()
  generating non-english month names. Wget expects english month names (of
  course). What fixed that for me was a change in WgetTests.pm:
  
  ...
  
  # FTP Server has to start with english locale due to use of
  strftime
  
  month names in LIST command
  
  setlocale(LC_ALL, 'C');
  $self-_launch_server(
 
 Thanks.  But how can this explain the 'index.html' thingy appearing in
 the FTP listing, instead of the expected afile.txt etc.?

The .listing file can't be parsed correctly when the month names are incorrect.

  Do you have a recent version of the auto* tools installed ?
 
 For some value of recent, yes.  How are autotools related to this
 issue?

'make check' explicitly sets LC_ALL=C. For some reason it seems to fail in your 
environment. You could add a print line to the server Perl code to see which 
settings it runs with.


  Maybe you can tell me your locale settings and/or try using the same
  locale
  settings on Gnu/Linux.
 
 It's the Windows equivalent of en_US.cp1255.
 
  'make check' explicitly sets LC_ALL=C.
  So i could produce these errors only with
  TESTS_ENVIRONMENT=LC_ALL=de_DE.utf8 make check
 
 I see the same problems no matter whether I run make check or the
 individual tests from the Bash prompt.

Either the auto* tools are broken (or let's say the test suite code) or your 
environment behaves unexpectedly.

But I simply can't fix it for you: I just don't (and won't) have your system / 
environment and can't reproduce these issues. I can just guess from what you 
tell me. But I know there are a few Windows users / developers reading this. 
Maybe they can help or bring some light !?

Tim




Re: [Bug-wget] Wget 1.16.1 detection of non-system openssl broken on MacOSX.

2014-12-14 Thread Tim Rühsen
Am Sonntag, 14. Dezember 2014, 18:12:05 schrieb Jochen Roderburg:
  OK, that worked, thanks; indeed, all I had to do was
  'PKG_CONFIG_PATH=/usr/local/ssl/lib/pkgconfig ./configure blah blah'. 
  Easy
  enough.  (That's the default location for a built-from-source openssl;
  is
  openssl not putting its .pc file where it should?)
  
  I guess yes, if you 'make install' your local copy of OpenSSL.
  
  But that's only half the battle, because that only covers the case where
  the Mac user has pkg-config installed.  Pkg-config doesn't come with OSX
  or
  the Apple dev tools.  Up through wget 1.16, the pkgconfigless Mac user
  could rely on --with-libssl-prefix to point wget to the right place.
  
  Please see the output of ./configure --help.
  If you don't have pkg-config installed, please try the following
  Add -I/usr/local/ssl/include to your CFLAGS
  
and add -L/usr/local/ssl/lib to your LDFLAGS.
  
  export both and ./configure.
 
 I have seen now here several work-arounds like the above for this issue,
 but no real answer to the OP's question: Why does a
 
./configure --with-ssl=openssl --with-libssl-prefix=/usr/local/ssl
 
 with wget 1.16.1 no longer give the same results as earlier versions in
 the situation on his system. From the discussion I understand that his
 situation is: the unwanted library installation is found by pkg-config
 and the wanted installation is not.
 
 In the current configure script I see that the --with-libssl-prefix
 option (and probably also all the other --with-xx-prefix options) is
 only handled by the old library detection code and not by the new
 pkg-config based detection. So when a library is found by pkg-config it
 cannot be overrun any longer by these configure options.
 
 I think the clean and compatible way to handle this issue would be to
 change the sequence of these three ways to find a library to: first
 respect the --with-xx-prefix option, then use the pkg-config method
 and finally the old detection code. Maybe one could also have a look how
 other projects handle this which offer similar options.

Thanks for the sum up, Jochen.

I already spent some time trying to find a project that handles such cases, with 
no luck so far. I often saw pkg-config used without a fallback.
The docs say: install pkg-config or die.
That is also an option for us, especially since pkg-config is becoming standard 
more and more. But of course, if we find an example of how to implement that *-
prefix stuff in a proper and maintainable manner (easy to read and understand) 
in configure.ac, that is the way to go.

I appreciate any help in finding an example.
Else we have to amend the documentation... I already made up a patch for the 
docs, but I am willing to wait a while before I push it.

Regards, Tim




Re: [Bug-wget] Wget 1.16.1 detection of non-system openssl broken on MacOSX.

2014-12-13 Thread Tim Rühsen
Am Freitag, 12. Dezember 2014, 22:16:38 schrieb Darshit Shah:
 On Fri, Dec 12, 2014 at 9:45 PM, Tim Ruehsen tim.rueh...@gmx.de wrote:
  On Thursday 11 December 2014 11:51:27 Charles Diza wrote:
  On Thu, Dec 11, 2014 at 4:39 AM, Tim Ruehsen tim.rueh...@gmx.de wrote:
   On Wednesday 10 December 2014 12:02:32 Charles Diza wrote:
Wget 1.16.1 has broken detection of non-built-in openssl on MacOSX.

Openssl comes with MacOSX but it's deprecated by Apple and it's an
old
version.  For this reason, many MacOSX users custom install a newer
openssl and put it in /usr/local/ssl (which, IIRC, is the default
location for custom openssl installs).

Up through wget 1.16, the following configure flags sufficed to make
wget's configure script recognize this custom openssl and *use* it:

./configure --with-ssl=openssl --with-libssl-prefix=/usr/local/ssl

But on wget 1.16.1, those same flags have no effect, and wget is
built
against the Mac system openssl in /usr/lib, which is old and
deprecated.
Something in the configure script must have changed.

I hope that this is either repaired, or that the README/INSTALL are
amended to include special instructions on how to force wget to pick
up
a custom openssl on MacOSX.

I'm no programmer, but I have a hunch that the same batch of
pkg-config
related changes (2014-11-01 in the ChangeLog) that broke pcre
handling
on MacOSX (See earlier thread) have broken openssl detection.

I do have pkg-config on my system, in /usr/local.  I have found that
whether or not I remove pkg-config from my system, I can't get
openssl
in /usr/local/ssl to get picked up and used in the link lines.
   
   Please try the following:
   - make a copy of openssl.pc (the pkg-config file of OpenSSL) into your
   wget
   directory.
   - change the first line 'prefix=...' to 'prefix=/usr/local/ssl'
   - try 'PKG_CONFIG_PATH=. ./configure --with-ssl=openssl'
   
   Later, you may keep your openssl.pc in /usr/local/pkgconfig/, so you
   can
   easily find and use it with other projects.
   
   Please report if this (or similar) works for you.
   Of course that has to documented... we simply didn't fall over this
   issue
   so
   far.
  
  OK, that worked, thanks; indeed, all I had to do was
  'PKG_CONFIG_PATH=/usr/local/ssl/lib/pkgconfig ./configure blah blah'. 
  Easy
  enough.  (That's the default location for a built-from-source openssl; is
  openssl not putting its .pc file where it should?)
  
  I guess yes, if you 'make install' your local copy of OpenSSL.
  
  But that's only half the battle, because that only covers the case where
  the Mac user has pkg-config installed.  Pkg-config doesn't come with OSX
  or
  the Apple dev tools.  Up through wget 1.16, the pkgconfigless Mac user
  could rely on --with-libssl-prefix to point wget to the right place.
  
  Please see the output of ./configure --help.
  If you don't have pkg-config installed, please try the following
  Add -I/usr/local/ssl/include to your CFLAGS
  
   and add -L/usr/local/ssl/lib to your LDFLAGS.
  
  export both and ./configure.
 
 But shouldn't openssl detection work without pkg-config too? We did
 retain the old detection code as a fallback mechanism in case
 pkg-config didn't work.

Please re-read the thread. pkg-config has no problems detecting OpenSSL.
We are talking about how pkg-config works with a second (custom) installation 
of OpenSSL (in /usr/local/ssl), and we already solved that issue.

Now we are at the point where we have to figure out how this procedure works 
without pkg-config. And that will be a different approach. I have a 'works for 
me' solution. But I want Charles to test it on his Mac.

 Given the number of complaints we've received about this, I think it's
 time to look back into configure.ac and figure out where that
 detection is going wrong. Users shouldn't have to do all these
 shenanigans to get Wget to compile.

number of complaints ? I count exactly one. But that is not the point. 
As Charles pointed out, it is a general problem. OSX's package 
management/organization is just a bit different from that of most Linux 
distributions. That's why it came up there first.

It seems the issue can be solved by proper documentation. I already put a 
patch for README.checkout on the list. But I guess I have to edit/extend it 
again.

Tim




Re: [Bug-wget] [PATCH] Script to check Wget status (for developers)

2014-12-05 Thread Tim Rühsen
Am Freitag, 5. Dezember 2014, 18:06:15 schrieb Giuseppe Scrivano:
 Tim Ruehsen tim.rueh...@gmx.de writes:
  +for CC in gcc clang-3.6; do
 
 could we replace clang-3.6 with just clang?

;-) Yes

On my office machine I do not have a 'clang' command, just clang-3.5 and 
clang-3.6 (but as I tested right now, an 'apt-get install clang' is what fixes 
it).

Tim




Re: [Bug-wget] [PATCH] OpenSSL TLSv1+ regression in wget-1.16

2014-12-03 Thread Tim Rühsen
Am Mittwoch, 3. Dezember 2014, 12:36:33 schrieb Jérémie Courrèges-Anglas:
 Hi,
 
 Giuseppe Scrivano gscriv...@gnu.org writes:
 
 [...]
 
  we should also hide --rand-egd from wget --help and do not accept this
  option when HAVE_RAND_EGD is not set.
 
 I thought about that and took the lazy approach: the option is still
 available even if gnutls is used, even though it's a nop.  Why then
 change the interface if libressl is used instead of openssl/gnutls?
 
 Or maybe this was merely overlooked and openssl should really be
 a special case here, dunno.

IMHO, we should accept --rand-egd to not introduce regressions.
But instead of silently ignoring the user's demand, we should print a warning 
about the LibreSSL/RAND_egd() issue. Maybe saying that a modern /dev/random 
is more secure than the EGD ?

It would not be nice if someone loses security without being warned.
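
Something along these lines, perhaps (a sketch only -- the struct, field name
and message text are made up, not actual wget code):

  #include <stdio.h>

  struct options { const char *egd_file; };      /* made-up stand-in */
  static struct options opt = { "/var/run/egd-pool" };

  int main (void)
  {
  #ifndef HAVE_RAND_EGD
    if (opt.egd_file)
      fprintf (stderr, "Warning: this build has no RAND_egd() support; "
                       "--rand-egd has no effect.\n"
                       "A modern /dev/random should be at least as secure.\n");
  #endif
    return 0;
  }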

 Or... another alternative would be to get rid of RAND_egd altogether,
 with --egd-file staying for compat for a few releases. :)

The question here is, where and in which way is EGD still useful !?
Maybe it is already obsolete on most systems ?
We should keep this in mind for 1.17+.

Tim




Re: [Bug-wget] [PATCH] Fixing C89 warnings

2014-12-03 Thread Tim Rühsen
Thanks, Gisle.

It is pushed.

  That's really good news to me. But there are still lots
  more C99 errors. Especially in main.c.

Feel free to send patches.

Sadly, I can't work with -Werror (but I use -std=c89 -pedantic -Wall -Wextra + 
more). And I also might overlook warnings because the scrolling is too fast.

Thanks for having a thorough look at the current changes.

Tim


Am Mittwoch, 3. Dezember 2014, 13:41:13 schrieb Gisle Vanem:
  That's really good news to me. But there are still lots
  more C99 errors. Especially in main.c.
 
 Now another C99 error got into openssl.c. Patch:
 
 --- Git-latest/src/openssl.c   2014-12-03 14:06:19 +0000
 +++ openssl.c                   2014-12-03 14:14:37 +0000
 @@ -170,6 +170,7 @@
   ssl_init (void)
   {
 SSL_METHOD const *meth;
 +  long ssl_options = 0;
 
   #if OPENSSL_VERSION_NUMBER = 0x00907000
 if (ssl_true_initialized == 0)
 @@ -203,8 +204,6 @@
 SSLeay_add_all_algorithms ();
 SSLeay_add_ssl_algorithms ();
 
 -  long ssl_options = 0;
 -
 switch (opt.secure_protocol)
   {
   #ifndef OPENSSL_NO_SSL2
 
 ---
 
 FYI. There are gcc options to trigger an error for these
 cases, such as gcc -Wdeclaration-after-statement -Werror.
 But there are other harmless warnings.




Re: [Bug-wget] [PATCH] Fix #41235

2014-12-01 Thread Tim Rühsen
Am Montag, 1. Dezember 2014, 18:38:07 schrieb Giuseppe Scrivano:
 Tim Ruehsen tim.rueh...@gmx.de writes:
  Fix some issues found by 'parfait' static analysis scanner.
  Even if these are not 'real' bugs, the changes look like good programming.
  
  Please review.
  
  Tim
  
  From aa34caf27a2d6a4bf245aa4b94518b0ee47d648b Mon Sep 17 00:00:00 2001
  From: =?UTF-8?q?Tim=20R=C3=BChsen?= tim.rueh...@gmx.de
  Date: Mon, 1 Dec 2014 17:02:33 +0100
  Subject: [PATCH] Fix issues reported by static code analysis tool
  'parfait'
 
 ACK

Pushed.

Tim



[Bug-wget] [PATCH] Replaced-xfree_null-by-xfree-and-nullify-argument-after-freeing

2014-11-29 Thread Tim Rühsen
This patch removes xfree_null() and replaces it by xfree().
xfree() automatically nullifies the argument after freeing to reduce dangling
pointer problems like double frees.

I also cleaned up code like
if (p)
  xfree (p);

and
xfree(p);
p = NULL;

Both cases reduce to a call to xfree(p).
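
The idea in a nutshell (a sketch only -- the actual macro in wget's utils.h
may differ in detail):

  #include <stdlib.h>

  /* free and nullify in one step, so a second xfree (p) is a harmless
     free (NULL) and use-after-free tends to crash early and visibly */
  #define xfree(p) do { free ((void *) (p)); (p) = NULL; } while (0)

  int main (void)
  {
    char *p = malloc (16);
    xfree (p);   /* p is NULL now */
    xfree (p);   /* ...so this is fine */
    return 0;
  }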

check and distcheck tested, also valgrind tested.

Please review and comment.

Tim

From ede5a27de1e2e5447ec31a918af789e657b53b1b Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Sat, 29 Nov 2014 17:54:20 +0100
Subject: [PATCH] Replaced xfree_null() by xfree() and nullify argument after
 freeing.

---
 src/ChangeLog   |   9 +
 src/connect.c   |   3 +-
 src/cookies.c   |  12 +++---
 src/ftp-basic.c |   2 +-
 src/ftp-ls.c|   6 +--
 src/ftp.c   |  12 +++---
 src/hash.c  |   9 +++--
 src/host.c  |   3 +-
 src/html-url.c  |   7 ++--
 src/http.c  | 119 +---
 src/init.c  |  86 
 src/iri.c   |   8 ++--
 src/log.c   |  11 ++
 src/main.c  |   8 ++--
 src/mswindows.c |   4 +-
 src/netrc.c |  14 +++
 src/openssl.c   |   4 +-
 src/recur.c |   6 +--
 src/res.c   |   3 +-
 src/retr.c  |  21 +-
 src/url.c   |  16 
 src/utils.h |   7 +---
 src/warc.c  |   6 +--
 23 files changed, 176 insertions(+), 200 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index ac2542c..920d822 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,12 @@
+2014-11-29  Tim Ruehsen tim.rueh...@gmx.de
+
+	* utils.h: xfree() sets argument to NULL after freeing,
+	removed xfree_null()
+	* connect.c, cookies.c, ftp-basic.c, ftp-ls.c, ftp.c hash.c,
+	host.c, html-url.c, http.c, init.c, iri.c, log.c, main.c,
+	mswindows.c, netrc.c, openssl.c, recur.c, res.c, retr.c,
+	url.c, warc.c: Replaced xfree_null() by xfree()
+
 2014-11-28  Tim Ruehsen tim.rueh...@gmx.de

 	* main.c: Fix length of program_argstring,
diff --git a/src/connect.c b/src/connect.c
index bc8d133..727d6a6 100644
--- a/src/connect.c
+++ b/src/connect.c
@@ -284,8 +284,7 @@ connect_to_ip (const ip_address *ip, int port, const char *print)
   logprintf (LOG_VERBOSE, _("Connecting to %s|%s|:%d... "),
  str ? str : escnonprint_uri (print), txt_addr, port);

-  if (str)
-  xfree (str);
+  xfree (str);
 }
   else
 {
diff --git a/src/cookies.c b/src/cookies.c
index 96af412..d4e0222 100644
--- a/src/cookies.c
+++ b/src/cookies.c
@@ -153,10 +153,10 @@ cookie_expired_p (const struct cookie *c)
 static void
 delete_cookie (struct cookie *cookie)
 {
-  xfree_null (cookie->domain);
-  xfree_null (cookie->path);
-  xfree_null (cookie->attr);
-  xfree_null (cookie->value);
+  xfree (cookie->domain);
+  xfree (cookie->path);
+  xfree (cookie->attr);
+  xfree (cookie->value);
   xfree (cookie);
 }

@@ -376,7 +376,7 @@ parse_set_cookie (const char *set_cookie, bool silent)
 {
   if (!TOKEN_NON_EMPTY (value))
 goto error;
-  xfree_null (cookie->domain);
+  xfree (cookie->domain);
   /* Strictly speaking, we should set cookie->domain_exact if the
   /* Strictly speaking, we should set cookie-domain_exact if the
  domain doesn't begin with a dot.  But many sites set the
  domain to foo.com and expect subhost.foo.com to get the
@@ -389,7 +389,7 @@ parse_set_cookie (const char *set_cookie, bool silent)
 {
   if (!TOKEN_NON_EMPTY (value))
 goto error;
-  xfree_null (cookie->path);
+  xfree (cookie->path);
   cookie->path = strdupdelim (value.b, value.e);
   cookie-path = strdupdelim (value.b, value.e);
 }
   else if (TOKEN_IS (name, expires))
diff --git a/src/ftp-basic.c b/src/ftp-basic.c
index 2f5765e..f9b9ad2 100644
--- a/src/ftp-basic.c
+++ b/src/ftp-basic.c
@@ -1136,7 +1136,7 @@ ftp_pwd (int csock, char **pwd)
 goto err;

   /* Has the `pwd' been already allocated?  Free! */
-  xfree_null (*pwd);
+  xfree (*pwd);

   *pwd = xstrdup (request);

diff --git a/src/ftp-ls.c b/src/ftp-ls.c
index e129191..399d1b4 100644
--- a/src/ftp-ls.c
+++ b/src/ftp-ls.c
@@ -363,8 +363,8 @@ ftp_parse_unix_ls (const char *file, int ignore_perms)
   if (error || ignore)
 {
   DEBUGP (("Skipping.\n"));
-  xfree_null (cur.name);
-  xfree_null (cur.linkto);
+  xfree (cur.name);
+  xfree (cur.linkto);
   continue;
 }

@@ -1089,7 +1089,7 @@ ftp_index (const char *file, struct url *u, struct fileinfo *f)
   else
 upwd = concat_strings (tmpu, "@", (char *) 0);
   xfree (tmpu);
-  xfree_null (tmpp);
+  xfree (tmpp);
 }
   else
 upwd = xstrdup ("");
diff --git a/src/ftp.c b/src/ftp.c
index e57c21c..9ea8819 100644
--- a/src/ftp.c
+++ b/src/ftp.c
Error in server response, closing control connection.\n"));
   return err;
 case FTPSRVERR :
   /* PWD unsupported -- assume "/". */
-  xfree_null 

Re: [Bug-wget] How to prevent .1.html numbering of downloaded file?

2014-11-28 Thread Tim Rühsen
Am Freitag, 28. November 2014, 16:38:43 schrieb B Wooster:
 OK some more info after some debugging.
 
 Looks like the problem is in the unique_name function. At that point, it
 does not know about adjust-extensions, so it always checks for name without
 the extension. And depending on how things are queued, it can cause correct
 or incorrect behavior. Anyone know if this is an existing issue, and any
 known workaround? I can locally change wget if necessary, and will likely
 do that after I figure it out.
 
 So if things are queued like this, it is all fine:
 article  (will save to article.html but calls unique_name with just
 article which luckily does not exist)
 article/post.html (will save to article/post.html, creating directory
 article)
 
 but this will mess it up:
 article/post.html (will save to article/post.html)
 article  (will save to article.html but calls unique_name with just
 article which by now exists).
 
 Sorting the queue (but then it is no longer a queue!) or better still:
 checking unique_name after adjust-extension has produced a suffix would
 fix this. Anyone have any tips?

Thanks for having a look into the issue.

The second seems to be the better choice.
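
For illustration, the idea is to uniquify the final, already-adjusted name;
a rough, self-contained sketch of that ordering (not wget code):

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  /* Append .1, .2, ... until the name does not exist yet. */
  static char *
  unique_name (const char *name)
  {
    char buf[4096];
    int i;

    snprintf (buf, sizeof buf, "%s", name);
    for (i = 1; access (buf, F_OK) == 0; i++)
      snprintf (buf, sizeof buf, "%s.%d", name, i);

    return strdup (buf);
  }

  int
  main (void)
  {
    /* "article" was already adjusted to "article.html", so uniquifying
       the adjusted name cannot collide with a directory named "article". */
    char *final = unique_name ("article.html");

    printf ("saving to %s\n", final);
    free (final);
    return 0;
  }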

I suggest you create a Python test (in the testenv/ directory) to have a simple
showcase that confirms your assumptions and proves your patch right (if you
are creating one).

If you are able and willing to create a test and/or patch we would appreciate 
that very much. Just send it here for further discussion.

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Check for flex before compiling

2014-11-23 Thread Tim Rühsen
Am Sonntag, 23. November 2014, 16:46:07 schrieb Darshit Shah:
 I've often seen people asking for help compiling Wget because a file, css.c
 cannot be found. This occurs because Flex is required to generate that file.
 However, configure does not attempt to check if flex is installed on the
 system.
 
 I believe that even if a dependency is documented in the README files, the
 configure script **MUST** check for it before continuing. Hence, I'd like to
 push this patch which checks for the existence of flex before continuing.

Good idea (I wasn't even aware of that missing check).

But why not simply use AC_PROG_LEX?

See
https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Particular-Programs.html

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Disable assertions by default

2014-11-22 Thread Tim Rühsen
Am Samstag, 22. November 2014, 09:55:23 schrieb Darshit Shah:
 On 11/21, Tim Rühsen wrote:
  Let's get this patch through first and others to handle the old
  assertions
  can flow in over the next week.
 
 Yes, looks good to me. Go push it.
 
 More comments below.
 
 Tim
 
 On Friday 21 November 2014 15:46:36 Darshit Shah wrote:
  On 11/21, Tim Rühsen wrote:
  On Friday 21 November 2014 13:19:18 Darshit Shah wrote:
   On 11/20, Ángel González wrote:
   On 20/11/14 15:29, Darshit Shah wrote:
   --- a/src/progress.c
   +++ b/src/progress.c
   @@ -992,6 +992,7 @@ create_image (struct bar_progress *bp, double dl_total_time, bool done)
     {
       int percentage = 100.0 * size / bp->total_length;
       assert (percentage <= 100);
   +   assert (false);
       if (percentage < 100)
         sprintf (p, "%3d%%", percentage);
   -- 2.1.3
   
   Surely you didn't want to include this :)
   
   Shit! No, that was meant to be for my own debugging purposes. I was
   trying
   to see if I could induce an assertion failure. Good thing the patch
   goes
   through review first.
   
   I've fixed this in the attached patch.
  
  Just a comment.
  In case the assertions are disabled, there still should be a check and a
  WARNING message. It should inform the user that something weird happened
  and the mailing list should be informed. Wget should try to continue if
  possible. We just had the report that an assertion occurred after hours
  (and many GB) of downloading and Wget just stopped... It was just one
  file
  out of thousands that would have been skipped...
  
  I agree. We should add more logging and a warning message for when a file
  cannot be downloaded. And for such recursive cases, the same information
  should be displayed at the end of the operation too. This is currently
  missing.
  
  IMHO, we should have something like this ASAP. And having this, we also
  might get rid of assertions completely. That could make this patch
  obsolete.
  
  I think we should not get rid of assertions. Some of the assertions, like
  the one at init.c:819 or progress.c:1180 are not for handling run-time
  errors but for intimating future developers about a logical error
  immediately. Such checks need to remain in the development code, but is
  worthless in production code.
 
 How can you make sure that a developer runs into the assert at init.c:819 ?
 I guess the only sure way to check the 'commands' array is by calling
 test_commands_sorted(). Only a 'make check' does it.
 
 I can't recollect how right now. But I remember triggering that assertion a
 large number of times when I was adding a new option to Wget to for the
 first time. While coming to grips with the code base, I'd often make a
 mistake somewhere in one of the three different places which you have to
 edit to add a new option. This assertion would always trigger and let me
 know that I'm doing something very wrong.
 
  Which is why I believe that assertions should remain. Just not all the
  ones
  we currently have.
 
 Some assertions surely *can* remain. What I try to say is: we do not need
 them. A well-thought-out message / action can always replace an assertion.
 
 A well-thought-out message replaces all assertions that need to remain in
 production code. Some assertions are meant to make the code fail instantly
 when some logic doesn't add up. For example the assertion I added at
 progress.c:1180 should never be available on end-user systems. It makes
 Wget fail instantly in certain scenarios when it would otherwise just
 continue downloading. However, as a developer, I'm interested in knowing
 immediately when that assertion fails. Because it implies that the length
 of the progress bar being drawn exceeded the size of my screen. This means
 there's an error in the logic in create_image() which needs to be debugged.
 
 Such are the scenarios where a developer wants an assertion, not a debug
 message. Because I *want* Wget to crash. But in a lot of cases in the Wget
 codebase, assertions have been used instead of error messages. I've been
 looking at those recently and over the next week will try to remove quite a
 few.

Yeah, that sounds good.

If a certain assertion is a real help (and thus really makes sense) for us, of 
course let's keep it.
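
For illustration, one way to implement the check-plus-WARNING idea (a hedged
sketch, not wget code; the macro name and the message wording are made up):

  #include <assert.h>
  #include <stdio.h>

  /* Development builds (NDEBUG undefined): fail fast via assert().
     Release builds (NDEBUG defined): warn, ask for a report, continue. */
  #define SOFT_ASSERT(cond)                                               \
    do                                                                    \
      {                                                                   \
        assert (cond);                                                    \
        if (!(cond))                                                      \
          fprintf (stderr, "Warning: internal inconsistency (%s)."        \
                   "  Please report this to the mailing list.\n", #cond); \
      }                                                                   \
    while (0)

With NDEBUG defined, assert() compiles to nothing and the warning branch takes
over, so a long recursive download is not aborted because of one odd file.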

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] Change testenv/Test-auth-both.py from XFAIL to a normal test

2014-11-22 Thread Tim Rühsen
Am Freitag, 21. November 2014, 21:13:45 schrieb Darshit Shah:
 Thanking You,
 Darshit Shah
 Sent from mobile device. Please excuse my brevity
 
 On 21-Nov-2014 8:45 pm, Tim Ruehsen tim.rueh...@gmx.de wrote:
  I had two issues with the above mentioned test.
  
  1. XFAIL is not common to people - we had some confusion on the mailing
  list.
 
 Xfail is standard parlance for expected failures. This is also documented
 in the readme file. Xfail is not something we introduced but is available
 in autotools as a standard feature.
 
  2. XFAIL is true for a test even if it fails for *any* reason.
  Example: When testing on a virtual machine without python3, 'make check'
  still happily reports XFAIL: 1 instead of reporting failure of all tests.
 
 This specific issue should be handled in the configure file. I'll try and
 hack it together tomorrow.
 
 I'm against this patch, since currently make check reports exactly as it
 should. The test is expected to fail. I do not know of any scenario where
 this particular test will fail for unexpected reasons. What you describe
 occurs when all tests fail.
 
 Let's keep the expected failures since it is a reminder of features that we
 currently lack.

Darshit, that is something different I wasn't aware of.
You say XFAIL is like a TODO list... well ok.
In this case there should be a (wishlist) bug, and it should be referred to
within the test source code. Maybe you can add a description to make clear
what is going on and what is missing in Wget. With that information I could go
and implement it.

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Fix various compiler warnings

2014-11-22 Thread Tim Rühsen
Am Samstag, 22. November 2014, 14:52:14 schrieb Darshit Shah:
 This is a new set of patches to eliminate some more compiler warnings.
 
 [PATCH 1/8] Mark unused parameter in utils.c
 [PATCH 2/8] Add extern declaration for version.c strings
 [PATCH 3/8] Fix missing extern declaration error for build_info.pl
 [PATCH 4/8] Declare extern numurls in common header
 [PATCH 5/8] Make extern declaration for program_name
 [PATCH 6/8] Add extern declaration for program_argstring
 [PATCH 7/8] Remove defensive assert in cookies.c
 [PATCH 8/8] Supplement logical assumption assert with error message
 
 Patches 1 through 6 eliminate compiler warnings. I've tested these patches
 in a fresh clone of Wget, and they seem to work without any of the issues
 that the patches I sent yesterday were suffering from.
 
 In patch 7, I've eliminated what looked like a purely defensive assert and
 added a conditional to check and print an error message. In case the cookie
 is not found, Wget can continue working.
 
 In patch 8 however, the assert is for identifying a logical issue in the
 code. Hence, I've retained it. And added a call to abort() after the error
 messages are printed out.
 
 Patches 7 and 8 are mostly meant to be examples of how we could handle the
 various assert statements in the codebase.


Thanks Darshit.

After applying your patch I got:

host.c: In function 'address_list_set_faulty':
host.c:156:7: warning: format '%s' expects argument of type 'char *', but
argument 4 has type 'int' [-Wformat=]
   logprintf (LOG_ALWAYS, "index: %d\nal->faulty: %s\n", index, al->faulty);

It's obvious:
+  logprintf (LOG_ALWAYS, "index: %d\nal->faulty: %s\n", index, al->faulty);
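
A minimal fix, assuming al->faulty is an int (that is what the compiler
reports for argument 4), is to print it with %d:

  logprintf (LOG_ALWAYS, "index: %d\nal->faulty: %d\n", index, al->faulty);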


Please feel free to push it after changing the above.

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] Change testenv/Test-auth-both.py from XFAIL to a normal test

2014-11-22 Thread Tim Rühsen
Am Samstag, 22. November 2014, 16:24:18 schrieb Darshit Shah:
 Another reason why I never got around to implementing this feature is that
 it is required by almost no one. The issue at hand is that when a Server
 responds with two possible authentication methods, the client is expected
 to choose the strongest one it knows. Instead Wget chooses the first one it
 knows. This violates the RFC and hence I marked it up as a bug. I'll
 probably add all this information into the test file in a while and push
 it.

I just implemented this feature (selecting the strongest auth method).

But the HTTP test server offers both (Digest,Basic) within a single WWW-
Authenticate line. The ABNF in RFC2616 does not allow this:

3.2.1 The WWW-Authenticate Response Header

   If a server receives a request for an access-protected object, and an
   acceptable Authorization header is not sent, the server responds with
   a "401 Unauthorized" status code, and a WWW-Authenticate header as
   per the framework defined above, which for the digest scheme is
   utilized as follows:

      challenge         = "Digest" digest-challenge

      digest-challenge  = 1#( realm | [ domain ] | nonce |
                          [ opaque ] | [ stale ] | [ algorithm ] |
                          [ qop-options ] | [auth-param] )


      domain            = "domain" "=" <"> URI ( 1*SP URI ) <">
      URI               = absoluteURI | abs_path
      nonce             = "nonce" "=" nonce-value
      nonce-value       = quoted-string
      opaque            = "opaque" "=" quoted-string
      stale             = "stale" "=" ( "true" | "false" )
      algorithm         = "algorithm" "=" ( "MD5" | "MD5-sess" |
                          token )
      qop-options       = "qop" "=" <"> 1#qop-value <">
      qop-value         = "auth" | "auth-int" | token


My knowledge is that the server has to send two lines of WWW-Authenticate to
offer two authentication methods. Maybe I am wrong, but I would like to know
where you got further information. Or is it just a mistake?

Example from Test HTTP server:
WWW-Authenticate: BasIc realm="Wget-Test", DIgest realm="Test",
nonce="f07e391eb19dfb441f191f5de7ba687f",
opaque="548c574974e749c0cfae06302b9e559b", qop="auth"

Don't start fixing the test server; I already have a fix and just await your answer.
 
Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] Change testenv/Test-auth-both.py from XFAIL to a normal test

2014-11-22 Thread Tim Rühsen
Am Samstag, 22. November 2014, 18:56:43 schrieb Tim Rühsen:
 But the HTTP test server offers both (Digest,Basic) within a single WWW-
 Authenticate line. The ABNF in RFC2616 does not allow this:

Sorry, I meant RFC2617.

And I found this in RFC2616 4.2:

   Multiple message-header fields with the same field-name MAY be
   present in a message if and only if the entire field-value for that
   header field is defined as a comma-separated list [i.e., #(values)].
   It MUST be possible to combine the multiple header fields into one
   field-name: field-value pair, without changing the semantics of the
   message, by appending each subsequent field-value to the first, each
   separated by a comma. The order in which header fields with the same
   field-name are received is therefore significant to the
   interpretation of the combined field value, and thus a proxy MUST NOT
   change the order of these field values when a message is forwarded.
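
For illustration, applied to WWW-Authenticate this combining rule means that
these two forms are semantically equivalent (values are placeholders):

  WWW-Authenticate: Digest realm="Test", nonce="..."
  WWW-Authenticate: Basic realm="Wget-Test"

  WWW-Authenticate: Digest realm="Test", nonce="...", Basic realm="Wget-Test"

So a parser has to be prepared to find several challenges, each with its own
comma-separated parameters, inside a single field value.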

RFC2617 does not have an explicit #(values) / comma-separated list defined for 
'Basic'. But maybe such a line would be valid:

WWW-Authenticate: BasIc realm="Wget-Test", DIgest realm="Test",
nonce="f07e391eb19dfb441f191f5de7ba687f",
opaque="548c574974e749c0cfae06302b9e559b", qop="auth"

So the questions are:
1. Is it allowed or not?
2. Should Wget support it or not? (which depends on real-life occurrences)

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] Change testenv/Test-auth-both.py from XFAIL to a normal test

2014-11-22 Thread Tim Rühsen
Am Samstag, 22. November 2014, 23:53:58 schrieb Darshit Shah:
 Multiple challenges in a single header are allowed. I had to hack a
 workaround in the Test suite explicitly to support this behaviour.
 
 I quote RFC 2616, sec. 14.47
 
 The WWW-Authenticate response-header field MUST be included in 401
 (Unauthorized) response messages. The field value consists of at least one
 challenge that indicates the authentication scheme(s) and parameters
 applicable to the Request-URI.
 
 WWW-Authenticate  = "WWW-Authenticate" ":" 1#challenge

Yes, just in this second I found an example in RFC7235 4.1:

For instance:

 WWW-Authenticate: Newauth realm="apps", type=1,
   title="Login to \"apps\"", Basic realm="simple"

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] Let Wget select strongest auth challenge

2014-11-22 Thread Tim Rühsen
Am Samstag, 22. November 2014, 16:24:18 schrieb Darshit Shah:
 Another reason why I never got around to implementing this feature is that
 it is required by almost no one. The issue at hand is that when a Server
 responds with two possible authentication methods, the client is expected
 to choose the strongest one it knows. Instead Wget chooses the first one it
 knows. This violates the RFC and hence I marked it up as a bug. I'll
 probably add all this information into the test file in a while and push
 it.

Hi Darshit,

I just made up a patch to

1. Parse multiple challenges from WWW-Authenticate
2. Select the strongest auth scheme

Please have a look at it.

Tim
From a4c9939376cd8327e55111af3b190dd2e91f5746 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Sat, 22 Nov 2014 22:00:28 +0100
Subject: [PATCH] Select most secure auth challenge

---
 src/http.c                         | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------
 testenv/server/http/http_server.py |  2 +-
 2 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/src/http.c b/src/http.c
index 87ceffd..832707d 100644
--- a/src/http.c
+++ b/src/http.c
@@ -2380,26 +2380,64 @@ read_header:
          the value "negotiate", and other(s) with data.  Loop over
          all the occurrences and pick the one we recognize.  */
       int wapos;
+      char *buf;
+      const char *www_authenticate = NULL;
       const char *wabeg, *waend;
-      char *www_authenticate = NULL;
-      for (wapos = 0;
-           (wapos = resp_header_locate (resp, "WWW-Authenticate", wapos,
+      const char *digest = NULL, *basic = NULL, *ntlm = NULL;
+      for (wapos = 0; !ntlm
+           && (wapos = resp_header_locate (resp, "WWW-Authenticate", wapos,
                                            &wabeg, &waend)) != -1;
            ++wapos)
-        if (known_authentication_scheme_p (wabeg, waend))
-          {
-            BOUNDED_TO_ALLOCA (wabeg, waend, www_authenticate);
-            break;
-          }
+        {
+          param_token name, value;
+
+          BOUNDED_TO_ALLOCA (wabeg, waend, buf);
+          www_authenticate = buf;
+
+          for (;!ntlm;)
+            {
+              /* extract the auth-scheme */
+              while (c_isspace (*www_authenticate)) www_authenticate++;
+              name.e = name.b = www_authenticate;
+              while (*name.e && !c_isspace (*name.e)) name.e++;
+
+              if (name.b == name.e)
+                break;
+
+              DEBUGP (("Auth scheme found '%.*s'\n", (int) (name.e - name.b), name.b));
+
+              if (known_authentication_scheme_p (name.b, name.e))
+                {
+                  if (BEGINS_WITH (name.b, "NTLM"))
+                    {
+                      ntlm = name.b;
+                      break; // most secure
+                    }
+                  else if (!digest && BEGINS_WITH (name.b, "Digest"))
+                    digest = name.b;
+                  else if (!basic && BEGINS_WITH (name.b, "Basic"))
+                    basic = name.b;
+                }
+
+              /* now advance over the auth-params */
+              www_authenticate = name.e;
+              DEBUGP (("Auth param list '%s'\n", www_authenticate));
+              while (extract_param (&www_authenticate, &name, &value, ',', NULL) && name.b && value.b)
+                {
+                  DEBUGP (("Auth param %.*s=%.*s\n",
+                           (int) (name.e - name.b), name.b, (int) (value.e - value.b), value.b));
+                }
+            }
+        }
 
-      if (!www_authenticate)
+      if (!basic && !digest && !ntlm)
         {
           /* If the authentication header is missing or
              unrecognized, there's no sense in retrying.  */
           logputs (LOG_NOTQUIET, _("Unknown authentication scheme.\n"));
         }
       else if (!basic_auth_finished
-               || !BEGINS_WITH (www_authenticate, "Basic"))
+               || !basic)
         {
           char *pth = url_full_path (u);
           const char *value;
@@ -2407,6 +2445,15 @@ read_header:
           auth_stat = xmalloc (sizeof (uerr_t));
           *auth_stat = RETROK;
 
+          if (ntlm)
+            www_authenticate = ntlm;
+          else if (digest)
+            www_authenticate = digest;
+          else
+            www_authenticate = basic;
+
+          logprintf (LOG_NOTQUIET, _("Authentication selected: %s\n"), www_authenticate);
+
           value =  create_authorization_line (www_authenticate,
                                               user, passwd,
                                               request_method (req),
diff --git a/testenv/server/http/http_server.py b/testenv/server/http/http_server.py

Re: [Bug-wget] Remove most warnings for missing extern

2014-11-21 Thread Tim Rühsen
Hi Darshit,

the same problem on a different machine (though also Debian unstable).

I also ran
  make maintainer-clean
  ./bootstrap
  ./configure
  make

Maybe something is missing in your patches ?
Or a make bug ? (make 4.0-8)


 IMHO, version.c and version.h have wrong dependencies in Makefile.am.
 They depend on changed wget_SOURCES, which don't have to change.
 
 I think version.c and version.h have the right dependencies. I'm not sure
 about the dependency on ../lib/libgnu.a, but the dependency on wget_SOURCES
 is perfect.

../lib/libgnu.a IMHO is wrong.

wget_SOURCES includes build_info.c, which depends on $(srcdir)/Makefile.am and
$(srcdir)/build_info.c.in.

Using these lines and everything works:

version.h: version.c
...
version.c: build_info.c
...

But version.h is static, so it should go into git and not be created.

Tim 

Am Freitag, 21. November 2014, 23:24:49 schrieb Darshit Shah:
 On 11/21, Tim Rühsen wrote:
 On Friday 21 November 2014 19:58:46 Darshit Shah wrote:
  On 11/21, Tim Rühsen wrote:
  On Friday 21 November 2014 17:13:22 Darshit Shah wrote:
   Clang provides some warnings for missing extern declarations for
   non-static
   variables. The following two patches clear most of them. I can currently
   see only one more such warning, which is caused by build_info.c. To fix
   this, someone will have to hack on the build_info.pl perl script.
  
  After ./boostrap
  ./configure
  
  I get
  
  main.c:58:21: fatal error: version.h: No such file or directory
  
    #include "version.h"
  
  I just ran:
  make maintainer-clean
  ./bootstrap
  ./configure
  make
  
  and it works for me without any issues. Maybe you could try running a
  simple make clean first? If it still doesn't work, I'll try looking into
  it again. But since I can't reproduce it, its hard to guess where the
  error would be.
 Try to remove src/version.c and src/version.h. Then perform a 'make'.
 
 I confirmed that both version.c and version.h didn't exist when I ran make.
 
 I also tried again by simply by deleting version.* before running make
 again. It still compiles perfectly for me.
 
  IMHO, version.c and version.h have wrong dependencies in Makefile.am.
 They depend on changed wget_SOURCES, which don't have to change.
 
 I think version.c and version.h have the right dependencies. I'm not sure
 about the dependency on ../lib/libgnu.a, but the dependency on wget_SOURCES
 is perfect.
 
 Let me start explaining my reasons with a question, when does version.c need
 to be re-created?
 
 Version.c needs regeneration when any of its component strings have a change
 in value.
 The $COMPILE string changes when the libraries being used change, or more
 frequently, when the CFLAGS variable changes.
 The link_string is dependent on the AM_CFLAGS, CFLAGS, LDFLAGS, AM_LDFLAGS
 and LIBS variables.
 One last part that the version data is dependent on is the current patch
 level of Wget.
 
 Now, within a Makefile, how do we determine these dependencies? Now, I'm not
 sure if commits that solely touch the non-C source files cause version.c to
 be regenerated. But all other commits will change files in wget_SOURCES.
 Hence forcing a change in version.c
 Similarly, any change in the CFLAGS, LDFLAGS or LIBS variables will also
 cause files in the wget_SOURCES variable to be recompiled. As a result of
 that, version.c will be marked for re-compilation since its dependencies
 have changed. This is exactly how we want it to behave.
 
 I could make it work by using
 version.h: $(srcdir)/Makefile.am
 ...
 version.c: version.h
 
 This at least worked (after a ./config.status), but it may be wrong.
 What exactly has to change so that we need to rebuild version.[ch]  ?
 It's not the Wget sources !
 
 In general, version.h doesn't need to be rebuilt. I think the best way is to
 explicitly write it and commit version.h into the source tree. It doesn't
 depend on any values that are generated at runtime. If everyone agrees on
 this setup, I'll gladly change the patch to simply statically include
 version.h. This does seem like a better and more efficient idea to me.
 
 version.c on the other hand, does indeed need to be re-generated every time
 the wget_SOURCES are regenerated, because it marks a change in either the
 patch level or in one of the compilation variables, which is otherwise
 impossible to track from a Makefile.
 
 Tim
 
  Maybe you can also fix these warnings them before pushing:
  
  host.c: In function 'address_list_set_faulty':
  host.c:148:55: warning: unused parameter 'index' [-Wunused-parameter]
  
   address_list_set_faulty (struct address_list *al, int index)
  
  cookies.c: In function 'discard_matching_cookie':
  cookies.c:303:15: warning: variable 'res' set but not used
  [-Wunused-but-set- variable]
  
 int res;
  
   It seems these warnings aren't enabled by -Wall and -Wextra on Clang. In
   fact -Wunused-but-set-variable is GCC-specific, which is why I never
   saw these warnings in my output.

Re: [Bug-wget] wcat?

2014-11-20 Thread Tim Rühsen
Am Donnerstag, 20. November 2014, 17:22:24 schrieb Dagobert Michelsen:
 Hi,
 
  Am 20.11.2014 um 08:55 schrieb Ángel González keis...@gmail.com:
  
  On 20/11/14 07:34, Darshit Shah wrote:
  And talking about legalities, I'm hoping you already have signed the
  assignment papers because otherwise that's even more work, before we can
  add this to the source. :-) 
  Come on, that's not needed for trivial changes :)
  The given shell script is a perfect example of trivial patch.
  
  And regarding the required options, I would keep the
  parameter-checking cruft to a minimum.
 
 I would consider this worse than not including it because if you don't do it
 right you get all sorts of problems. My main concern is that it should
 hardcode $(bindir)/wget as passed to configure; e.g. /opt/csw/bin/wcat
 with /opt/csw/bin not in the PATH would result in invoking the wrong (or
 no) wget. This requires substitution during configure rather than
 being put in contrib.

From what I read here and in this thread, having a script seems to be more 
work than just building it into the Wget code.

The Unix way: wget checks if the name of the executable is 'wcat'.
Doing so after all options have been processed, 'wcat' could easily check if
e.g. -O has been given on the command line (or the config file) and stop with 
an 
error message (or whatever is appropriate).
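
A rough sketch of that idea (illustrative only, not wget code; option
handling is elided):

  #include <libgen.h>
  #include <stdio.h>
  #include <string.h>

  int
  main (int argc, char **argv)
  {
    /* Behave as 'wcat' when invoked under that name. */
    int wcat_mode = strcmp (basename (argv[0]), "wcat") == 0;

    (void) argc;
    /* ... normal option parsing would happen here ... */

    if (wcat_mode)
      printf ("wcat mode: would reject options like -O here\n");

    return 0;
  }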

It's easier to maintain for us developers, I guess.

Just an idea. I am not going too deep into this discussion, too much work left 
for next two weeks ;-)

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] *BROKEN* Configure cleanup

2014-11-20 Thread Tim Rühsen
Am Donnerstag, 20. November 2014, 15:42:22 schrieb Tim Ruehsen:
 On Thursday 20 November 2014 15:36:57 Darshit Shah wrote:
  On 11/20, Darshit Shah wrote:
  How does this look for another attempt at the configure file?
  
  Here's another patch that uses pkg-config to check for libpsl
  
  Also, if there are no objections, I'll also push the patch that adds
  -Wextra to the default CFLAGS.
 
 Sorry, you forgot
 
 # correct $LIBPSL_LIBS (in libpsl <= 0.6.0)
 AS_IF([test "x$LIBPSL_LIBS" = "x-llibpsl "], [LIBPSL_LIBS="-lpsl"])
 
 Missing this, ./configure will not work any more (you called it "spurious
 errors").
 
 Please add it ASAP, just can't build Wget right now.
 
 My code example was:
 
 AC_ARG_WITH(libpsl, AS_HELP_STRING([--without-libpsl], [disable support for
 libpsl cookie checking]), with_libpsl=$withval, with_libpsl=yes)
 AS_IF([test "x$with_libpsl" != "xno"], [
   PKG_CHECK_MODULES([LIBPSL], libpsl, [
     with_libpsl=yes
     # correct $LIBPSL_LIBS (in libpsl <= 0.6.0)
     AS_IF([test "x$LIBPSL_LIBS" = "x-llibpsl "], [LIBPSL_LIBS="-lpsl"])
     LIBS="$LIBPSL_LIBS $LIBS"
     CFLAGS="$LIBPSL_CFLAGS $CFLAGS"
     AC_DEFINE([WITH_LIBPSL], [1], [PSL support enabled])
   ], [
     AC_SEARCH_LIBS(psl_builtin, psl,
       [with_libpsl=yes; AC_DEFINE([WITH_LIBPSL], [1], [PSL support enabled])],
       [with_libpsl=no;  AC_MSG_WARN(*** libpsl was not found. Fallback to builtin cookie checking.)])
   ])
 ])
 
 Tim

I had to fix it here anyway for testing.
I pushed the fix.

Tim

signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] Fixing C89 warnings

2014-11-20 Thread Tim Rühsen
Am Donnerstag, 20. November 2014, 15:56:00 schrieb Gisle Vanem:
 Tim Ruehsen wrote:
  [1]:
  #else
  # define count_cols(mbs) ((int)(strlen(mbs)))
  # define cols_to_bytes(mbs, cols, *ncols) do {  \
  
  *ncols = cols;  \
  bytes = cols;   \
  
  }while (0)
  #endif
  
  (I forgot to add 'HAVE_WCWIDTH' and 'HAVE_MBTOWC').
  
  Could you send a patch? I am not sure what to fix here.
 
 FYI, the error from MSVC was:
   progress.c(844) : error C2010: '*' : unexpected in macro formal parameter list
   progress.c(978) : error C2059: syntax error : 'do'
 
 Here is a patch:
 
 --- ../Git-latest/src/progress.c2014-11-20 15:39:55 +
 +++ progress.c  2014-11-20 16:44:03 +
 @@ -841,10 +841,7 @@
   }
   #else
   # define count_cols(mbs) ((int)(strlen(mbs)))
 -# define cols_to_bytes(mbs, cols, *ncols) do {  \
 -*ncols = cols;  \
 -bytes = cols;   \
 -}while (0)
 +# define cols_to_bytes(mbs, cols, ncols) *ncols = cols
   #endif
 
  Does this hold true for Win32 (WinXP 32bit) ?
  Or do we have to amend this check ?
 
 Windows since way back has supported > 4 GB files. It's
 been compilers that were slow following that. Since
 MinGW/MSVC have libc support for huge files,
 'wgint' is hardcoded to 64 bits signed. I vaguely remember
 Hrvoje and I discussed this long before you switched to
 Gnulib.

Instead of a #define to replace a function, I decided to write two small 
'dummy' functions here. It's pushed now.
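
The fallbacks look roughly like this (a sketch; see src/progress.c for the
committed version):

  #include <string.h>

  /* Non-multibyte fallbacks: one byte equals one column. */
  static int
  count_cols (const char *mbs)
  {
    return (int) strlen (mbs);
  }

  static int
  cols_to_bytes (const char *mbs, const int cols, int *ncols)
  {
    (void) mbs;   /* unused in the fallback */
    *ncols = cols;
    return cols;
  }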

I also pushed your patch/suggestion to always assume large-file for (defined 
WINDOWS) in src/build_info.c.in.

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] Fixing C89 warnings

2014-11-19 Thread Tim Rühsen
Am Mittwoch, 19. November 2014, 22:07:28 schrieb Darshit Shah:
 On 11/18, Tim Rühsen wrote:
 This patch fixes most C89 warnings for me (-std=c89 -pedantic), since these
 may prevent compiling with MSVC.
 
 There are still some warnings "ISO C forbids conversion of function pointer
 to object pointer type" [-Wpedantic] left over. I'll care for these in the
 next days. There are architectures where function pointers have a
 different size from void *. That's why this warning has a meaning.
 
 Tim
 
 From 11baace21e1fa04a92baa395f38ebad8001e9762 Mon Sep 17 00:00:00 2001
 From: Tim Ruehsen tim.rueh...@gmx.de
 Date: Tue, 18 Nov 2014 22:00:48 +0100
 Subject: [PATCH] Trivial fixes for C89 compliancy.
 
 snip
 
 diff --git a/src/gnutls.c b/src/gnutls.c
 index 87d1d0b..42201e5 100644
 --- a/src/gnutls.c
 +++ b/src/gnutls.c
 @@ -54,6 +54,10 @@ as that of the covered work.  */
 
  # include "w32sock.h"
  #endif
 
 +#ifdef HAVE_ALLOCA_H
 +# include alloca.h
 +#endif
 +
 
  #include "host.h"
  
  static int
 
 @@ -122,9 +126,10 @@ ssl_init (void)
 
while ((dent = readdir (dir)) != NULL)

  {
  
struct stat st;
 
 -  char ca_file[dirlen + strlen(dent->d_name) + 2];
 +  size_t ca_file_length = dirlen + strlen(dent->d_name) + 2;
 +  char *ca_file = alloca(ca_file_length);
 
 What happens when HAVE_ALLOCA_H is not defined? The above code will attempt
 to call a function that does not exist and Wget will crash.
 
 I think, we can simply malloc / free() these. Sure alloca is more
 convenient, but we need a valid fallback for when it is not available.

There are systems without the alloca.h header file. Thus the check.
(I just had an error report for libpsl with this issue.)

I do not know of systems without the alloca() function.
The Wget sources already use alloca in several places.
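
For comparison, a malloc()-based fallback along the lines Darshit suggests
would look roughly like this (illustrative only; ca_directory and d_name are
stand-ins, not the real variable names):

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  int
  main (void)
  {
    const char *ca_directory = "/etc/ssl/certs";   /* stand-in value */
    const char *d_name = "ca-bundle.crt";          /* stand-in value */
    size_t len = strlen (ca_directory) + strlen (d_name) + 2;
    char *ca_file = malloc (len);                  /* heap instead of alloca */

    if (!ca_file)
      return 1;
    snprintf (ca_file, len, "%s/%s", ca_directory, d_name);
    printf ("%s\n", ca_file);
    free (ca_file);                                /* explicit free needed */
    return 0;
  }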

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] Fix possible issues running in a turkish locale

2014-11-19 Thread Tim Rühsen
Am Mittwoch, 19. November 2014, 22:21:12 schrieb Darshit Shah:
 On 11/18, Tim Rühsen wrote:
I amended three tests to fail when run with a Turkish locale.
I fixed these issues (using c_strcasecmp/c_strncasecmp) and also replaced
strcasecmp/strncasecmp by c_strcasecmp/c_strncasecmp at places where we
definitely want an ASCII comparison instead of a locale-dependent one.
 
 There are still some places left where we use strcasecmp/strncasecmp, e.g.
 domain/host and filename comparisons.
 
 Please have a look...
 
 Tim
 
 In cookies.c, the header file is added on the same line as the ^L Page break
 character. Please add the file before the Page Break.
 
 I'm assuming your editor didn't show you that character, but git diff does.
 Try to enable such invisible characters in your editor.

No, you are right. Vi shows it, my IDE does not.
Dumb question: I never saw ^L in source code. What is it good for ? Printing ?


 Apart from that, it looks like a good patch. Please push it when you can.

Thanks, I'll have some time in a few hours.

Tim


signature.asc
Description: This is a digitally signed message part.

