Re: [Bug-wget] Bug on latest wget (1.3.14)

2012-03-29 Thread Tim Rühsen
In url.c / url_file_name() an empty query is not used for the filename:

  /* Append "?query" to the file name. */
  u_query = u->query && *u->query ? u->query : NULL;
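
The effect is easy to demonstrate with a small standalone sketch (hypothetical code, not wget's actual logic): with the empty query dropped, both URLs map to the same local name:

  /* Hypothetical sketch of the collision, not wget's actual code:
     dropping an empty query maps both URLs to the same local name. */
  #include <stdio.h>

  static void local_name (const char *file, const char *query,
                          char *out, size_t n)
  {
    /* query == NULL: no '?' in the URL; query == "": empty query */
    if (query && *query)
      snprintf (out, n, "%s?%s", file, query);
    else
      snprintf (out, n, "%s", file);   /* an empty query is lost here */
  }

  int main (void)
  {
    char a[256], b[256];
    local_name ("stagsans-book-webfont.eot", NULL, a, sizeof a);
    local_name ("stagsans-book-webfont.eot", "",   b, sizeof b);
    printf ("%s\n%s\n", a, b);  /* identical: the second download clobbers the first */
    return 0;
  }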

Should it be patched here?

Kind regards,

Tim Rühsen

On Thursday 29 March 2012, Tim Ruehsen wrote:
 Just some more info:
 It is reproducible with the latest trunk version.
 
 The problem seems to be empty queries like in main.css (original):
 src: url('/TLBB/fbinir/mult/stagsans-book-webfont.eot');^M
 src: url('/TLBB/fbinir/mult/stagsans-book-webfont.eot?#iefix')
 format('embedded-opentype'),^M
 
 BTW, empty queries are absolutely legal (RFC 2396: query = *uric).
 
 The downloader downloads
   /TLBB/fbinir/mult/stagsans-book-webfont.eot
 and
   /TLBB/fbinir/mult/stagsans-book-webfont.eot?
 and saves them both into the same file:
 Saving to: 'accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-book-webfont.eot'
 
 I assume the hashmaps 'dl_url_file_map' and 'dl_file_url_map' are now out
 of sync.
 Now the scanning can't find a local file for the first download and thus
 does not translate it to a local name but to a complete (remote) one.
 
 Here is some debug output where you can see it (look for 'complete', which
 should be 'local'):
 
 Scanning accionistaseinversores.bbva.com/TLBB/fbinir/css/main.css?v=1 (from
 http://accionistaseinversores.bbva.com/TLBB/fbinir/css/main.css?v=1)
 Loaded accionistaseinversores.bbva.com/TLBB/fbinir/css/main.css?v=1 (size
 99449).
 accionistaseinversores.bbva.com/TLBB/fbinir/css/main.css?v=1:
 merge('http://accionistaseinversores.bbva.com/TLBB/fbinir/css/main.css?v=1', '/TLBB/fbinir/mult/stagsans-book-webfont.eot') ->
 http://accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-book-webfont.eot
 appending 'http://accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-book-webfont.eot' to urlpos.
 Found URI: [url('/TLBB/fbinir/mult/stagsans-book-webfont.eot')] at 2404 [/TLBB/fbinir/mult/stagsans-book-webfont.eot]
 accionistaseinversores.bbva.com/TLBB/fbinir/css/main.css?v=1:
 merge('http://accionistaseinversores.bbva.com/TLBB/fbinir/css/main.css?v=1', '/TLBB/fbinir/mult/stagsans-book-webfont.eot?#iefix') ->
 http://accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-book-webfont.eot?#iefix
 appending 'http://accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-book-webfont.eot?' to urlpos.
 Found URI: [url('/TLBB/fbinir/mult/stagsans-book-webfont.eot?#iefix')] at 2462 [/TLBB/fbinir/mult/stagsans-book-webfont.eot?#iefix]
 
 will convert url http://accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-book-webfont.eot to complete
 URI encoding = 'ANSI_X3.4-1968'
 will convert url http://accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-book-webfont.eot? to local
 accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-book-webfont.eot
 URI encoding = 'ANSI_X3.4-1968'
 
 Tim Ruehsen
 
 On Thursday 29 March 2012, Alejandro Supu wrote:
  Hi,
  
  I have found a bug on the latest version of the http client, wget 1.3.14
  
  This is how to reproduce it:
  
  If we save the page:
  http://accionistaseinversores.bbva.com/TLBB/tlbb/bbvair/esp/index.jsp
  with the following parameters: wget -k -p
  http://accionistaseinversores.bbva.com/TLBB/tlbb/bbvair/esp/index.jsp
  
  On the saved main.css file
  (\accionistaseinversores.bbva.com\TLBB\fbinir\css), there are files that
  point to the remote files instead of the saved ones! For example, on lines
  57, 68 and 79, it points to
  http://accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-light-webfont.eot
  instead of ../mult/stagsans-book-webfont.eot, and this file was
  saved locally... There are other files with the same behaviour.
  
  If you search for the string "http" within the CSS file, you will find
  all the files pointing to remote locations instead of the local SAVED ones.
  
  Please, tell me anything related to this bug or when it will be
  corrected.
  
  THANKS!
-- 



Re: [Bug-wget] (Patch) Bug on latest wget (1.3.14)

2012-03-30 Thread Tim Rühsen
Hello Alejandro,

here is a patch that fixes the issue with empty HTTP queries.

But the website has two files that can't be loaded (404 Not Found). These
files won't be translated to local filenames. This is correct behaviour,
since these files do not exist locally.

Giuseppe, I put you in CC since I am not sure you read all discussions on
the list.

Tim

On Thursday 29 March 2012, Alejandro Supu wrote:
 Hi,
 
 I have found a bug on the latest version of the http client, wget 1.3.14
 
 This is how to reproduce it:
 
 If we save the page:
 http://accionistaseinversores.bbva.com/TLBB/tlbb/bbvair/esp/index.jsp with
 the following parameters: wget -k -p
 http://accionistaseinversores.bbva.com/TLBB/tlbb/bbvair/esp/index.jsp
 
 On the saved main.css file
 (\accionistaseinversores.bbva.com\TLBB\fbinir\css), there are files that
 point to the remote files instead of the saved ones! For example, on lines
 57, 68 and 79, it points to
 http://accionistaseinversores.bbva.com/TLBB/fbinir/mult/stagsans-light-webfont.eot
 instead of ../mult/stagsans-book-webfont.eot, and this file was
 saved locally... There are other files with the same behaviour.
 
 If you search for the string "http" within the CSS file, you will find all
 the files pointing to remote locations instead of the local SAVED ones.
 
 Please, tell me anything related to this bug or when it will be corrected.
 
 THANKS!
-- 
OMS Open Media System GmbH
Holzdamm 40
20099 Hamburg
Fon +49-40-238878-40
Fax +49-40-238878-99
Email tim.rueh...@openmediasystem.de
Sitz und Registergericht Hamburg
HRB 57616
=== modified file 'src/ChangeLog'
--- src/ChangeLog	2012-03-25 15:49:55 +
+++ src/ChangeLog	2012-03-30 09:18:54 +
@@ -1,3 +1,7 @@
+2012-03-30  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* url.c: use empty query in local filenames
+
 2012-03-25  Giuseppe Scrivano  gscriv...@gnu.org
 
 	* utils.c: Include sys/ioctl.h.

=== modified file 'src/url.c'
--- src/url.c	2011-01-01 12:19:37 +
+++ src/url.c	2012-03-30 09:14:56 +
@@ -1502,7 +1502,7 @@
 {
   struct growable fnres;        /* stands for "file name result" */
 
-  const char *u_file, *u_query;
+  const char *u_file;
   char *fname, *unique;
   char *index_filename = "index.html"; /* The default index file is index.html */
 
@@ -1561,12 +1561,11 @@
   u_file = *u->file ? u->file : index_filename;
   append_uri_pathel (u_file, u_file + strlen (u_file), false, &fnres);
 
-  /* Append "?query" to the file name. */
-  u_query = u->query && *u->query ? u->query : NULL;
-  if (u_query)
+  /* Append "?query" to the file name, even if empty. */
+  if (u->query)
     {
       append_char (FN_QUERY_SEP, &fnres);
-      append_uri_pathel (u_query, u_query + strlen (u_query),
+      append_uri_pathel (u->query, u->query + strlen (u->query),
                          true, &fnres);
     }
 }
 }



Re: [Bug-wget] Supercookie issues

2012-11-09 Thread Tim Rühsen
On Friday, 9 November 2012, Ángel González wrote:
 On 09/11/12 16:27, Tim Ruehsen wrote:
  While implementing cookies for Mget (https://github.com/rockdaboot/mget)
  conforming to RFC 6265, I stumbled over http://publicsuffix.org/ (the
  Mozilla Public Suffix List).
 
  Looking at the Wget sources reveals that there is just a very incomplete
  check for public suffixes. That implies a very severe vulnerability to
  supercookie attacks when cookies are switched on (they are by default).
 
  Since Mget was meant as a Wget2 candidate (all or parts of the sources),
  please feel free to copy the needed source code from it (see
  cookie.c/cookie.h and tests/test.c for test routines). Right now, I just
  don't have the time to do the work, but of course I will answer your
  questions.
 
  Shouldn't there be a warning within the docs / man pages?
  What do you think?
 
  Regards, Tim

 I see little reason for concern about supercookies on wget, given that it
 is unlikely to be used for different tasks in the same invocation, and
 cookies are not automatically loaded/saved across invocations.
 And for having a supercookie passed in the same run (e.g. one website
 redirected to the other), they are probably cooperating domains, so the
 supercookie doesn't add much information.
 You would need to be using --load-cookies and --save-cookies to allow such
 supercookie spying.

That's what I use wget for: logging in on a website and accessing my private
data. One could do that even with one call to wget, so --load/save-cookies
is not even needed.

 The worst case is probably if the cookie file was shared with a browser,
 or it was taken from a browser (with many cookies unrelated to what is
 intended) and passed to wget with --load-cookies, and wget sent more
 cookies than expected.

That is one possibility, but as I said, you won't need --load/save-cookies
to be vulnerable.

 Although not too important, it should be fixed, of course. The Mozilla
 Public Suffix List isn't very simple to reuse; its format is designed for
 how they use it internally.

No, the format is easy, clear, understandable and well documented.

Maybe I didn't put it understandably in my first post:
the code is there and tested.
You just have to call cookie_load_public_suffixes(filename) once, then call
cookie_suffix_match(domain) repeatedly. Very easy.
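
For illustration, here is a minimal self-contained sketch of the kind of
check involved (the tiny hardcoded list and the helper name are just
stand-ins for the real PSL file and the mget API):

  /* Minimal sketch of a public-suffix check; the hardcoded list and
     helper name are stand-ins for the real PSL and mget's API. */
  #include <stdio.h>
  #include <strings.h>

  static const char *suffixes[] = { "com", "org", "co.uk", NULL };

  /* Return 1 if 'domain' is exactly a public suffix, i.e. a cookie
     with Domain=<domain> would be a supercookie and must be rejected. */
  static int is_public_suffix (const char *domain)
  {
    for (int i = 0; suffixes[i]; i++)
      if (!strcasecmp (domain, suffixes[i]))
        return 1;
    return 0;
  }

  int main (void)
  {
    printf ("%d\n", is_public_suffix ("co.uk"));          /* 1: reject cookie */
    printf ("%d\n", is_public_suffix ("example.co.uk"));  /* 0: accept */
    return 0;
  }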

Regards, Tim



Re: [Bug-wget] Supercookie issues

2012-11-11 Thread Tim Rühsen
On Friday, 9 November 2012, Ángel González wrote:
 On 09/11/12 20:17, Tim Rühsen wrote:
  On Friday, 9 November 2012, Ángel González wrote:
   I see little reason for concern about supercookies on wget, given that
   it is unlikely to be used for different tasks in the same invocation,
   and cookies are not automatically loaded/saved across invocations.
   And for having a supercookie passed in the same run (e.g. one website
   redirected to the other), they are probably cooperating domains, so the
   supercookie doesn't add much information.
   You would need to be using --load-cookies and --save-cookies to allow
   such supercookie spying.
  That's what I use wget for: logging in on a website and accessing my
  private data. One could do that even with one call to wget, so
  --load/save-cookies is not even needed.
 
   The worst case is probably if the cookie file was shared with a browser,
   or it was taken from a browser (with many cookies unrelated to what is
   intended) and passed to wget with --load-cookies, and wget sent more
   cookies than expected.
  That is one possibility, but as I said, you won't need
  --load/save-cookies to be vulnerable.

 Can you provide an example of how you are using wget that leads you to
 think you would be vulnerable?
 I think you may be misunderstanding something (but perhaps it turns out
 it's me who is wrong!).
 
 Maybe it is most dangerous if you have load_cookies and save_cookies set
 in your wgetrc, instead of passing an appropriate one for each task.
 
 Even then, the case you mention of stealing your private data is quite
 hard, as the to-be-stolen website is very unlikely to use a vulnerable
 domain (i.e. one not caught by the general rules used by wget).

Thanks to your questions I spent more time on the issue.

Well, at least the expression "supercookie" is misleading. I got it from
http://publicsuffix.org, but meanwhile I know "supercookie" in general means
DOM/Web storage - another kind of persistently storing server info on the
client side. But that is another issue, not relevant for wget right now.

In general: storing cookies with a public domain attribute *may* leak your
privacy. The normal user is not aware of that fact, so the client software
should take care of it. Most browsers do that by using the PSL (Public
Suffix List). Wget is a cookie-storing client and thus should use the PSL
as well. The Mozilla PSL is - as they write - not perfect, but it is all we
have.
I can't give you an example where privacy leaking really takes place. But I
could construct / think of such cases.


   Although not too important, it should be fixed, of course. The Mozilla
   Public Suffix List isn't very simple to reuse; its format is designed
   for how they use it internally.
   No, the format is easy, clear, understandable and well documented.
  
   Maybe I didn't put it understandably in my first post:
   the code is there and tested.
   You just have to call cookie_load_public_suffixes(filename) once, then
   call cookie_suffix_match(domain) repeatedly. Very easy.
 I answered without checking, from what I remembered. Reading it now, it
 does seem quite straightforward to interpret the list. Although perhaps a
 bit more complex to do that efficiently for multiple domains.
 
 I don't know where that function you mention is; I don't see it on the
 website. Perhaps it belongs to your mget?

Yes. Go to https://github.com/rockdaboot/mget and click on 'cookie.c'.
Test cases can be found in tests/test.c/test_cookies()

Regards, Tim



Re: [Bug-wget] Portability to platforms without C99

2012-11-24 Thread Tim Rühsen
On Thursday, 22 November 2012, Hrvoje Niksic wrote:
 Giuseppe Scrivano gscriv...@gnu.org writes:
 
  Let's be realistic, is there any platform/system (with more than 3
  users) where C99 is a problem?
 
  Visual Studio is not a problem as there are other ways to build wget on
  Windows that don't stick us to something more than 20 years old.
 
 If Wget is no longer concerned with portability to older compilers and
 architectures, there should be an announcement so that the decision can
 at least be discussed before it's implemented.
 

That is always a good idea.

Giuseppe could also just state that the upcoming 2.x version needs C99
and/or POSIX 200x. Backports to the C89 1.x version could easily be made by
interested persons. Nobody would really be hurt by such a decision.

Regards, Tim



Re: [Bug-wget] Bug with GNU Wget 1.13.4, --config

2012-12-08 Thread Tim Rühsen
On Tuesday, 4 December 2012, Adrien Dumont wrote:
 Hi,
 
 I have found a bug in GNU Wget 1.13.4:
 
 wget $edt_url --config=$wget_config \
 --post-data="login=$edt_login&password=$edt_password&action=Connexion" \
 --keep-session-cookies --save-cookies '/tmp/edt_cookies.txt' \
 -O '/dev/null' -nv -a $log
 
 is not equivalent to
 
 wget $edt_url \
 --post-data="login=$edt_login&password=$edt_password&action=Connexion" \
 --keep-session-cookies --save-cookies '/tmp/edt_cookies.txt' \
 -O '/dev/null' -nv -a $log --config=$wget_config
 
 In the first case, wget runs correctly.
 
 In the second case, wget ignores --config.
 
 $wget_config is a file which contains proxy parameters.
 
 cat $wget_config
 http_proxy = http://.:@10.10.28.5:3128
 use_proxy = on
 wait = 15

Just to confirm it: Wget 1.14 suffers from the same behaviour. I am not
sure whether it is a bug or a documented feature.

Reducing the CLI options, it turns out that the order of --config and
--post-data matters: --config after --post-data ignores the proxy settings.

Regards, Tim



Re: [Bug-wget] FeatureSpecification - HTTP compression

2012-12-09 Thread Tim Rühsen
On Saturday, 8 December 2012, 7382...@gmail.com wrote:
 Hello
 
 I think wget should support HTTP compression (Accept-Encoding: gzip,
 deflate). It would put less strain on the servers being downloaded from,
 and use less of their bandwidth. Is it okay to add this idea to the
 http://wget.addictivecode.org/FeatureSpecifications page? I don't know
 where on the page to add it, and thought I should check first before doing
 so in case there is a reason it isn't there.
 
 Thank you for your time
 

The next Wget to come (still called Mget) already has this feature working.

https://github.com/rockdaboot/mget

It still needs some work (mainly FTP and WARC stuff); any help is
appreciated.
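
For the curious, a rough sketch of the decoding side using zlib (an
illustration under my own assumptions, not Mget's actual code): the client
sends 'Accept-Encoding: gzip' and, if the server answers with
'Content-Encoding: gzip', inflates the body:

  /* Sketch of gzip response decoding with zlib (illustration only,
     not Mget's actual code). */
  #include <string.h>
  #include <zlib.h>

  /* Inflate a gzip-encoded HTTP body into 'out'; returns the number
     of decoded bytes, or -1 on error (assumes 'out' is big enough). */
  static long gunzip_body (const unsigned char *in, size_t inlen,
                           unsigned char *out, size_t outlen)
  {
    z_stream strm;
    memset (&strm, 0, sizeof strm);
    /* 16 + MAX_WBITS tells zlib to expect a gzip (not raw zlib) header. */
    if (inflateInit2 (&strm, 16 + MAX_WBITS) != Z_OK)
      return -1;
    strm.next_in = (unsigned char *) in;
    strm.avail_in = inlen;
    strm.next_out = out;
    strm.avail_out = outlen;
    int rc = inflate (&strm, Z_FINISH);
    inflateEnd (&strm);
    return rc == Z_STREAM_END ? (long) (outlen - strm.avail_out) : -1;
  }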

Regards, Tim



Re: [Bug-wget] Syntax for RESTful scripting options

2013-03-08 Thread Tim Rühsen
On Tuesday, 5 March 2013, Darshit Shah wrote:
 Need some help with writing a test for this functionality.
 I have implemented a --method=HTTPMethod command that currently supports
 DELETE only.
 
 I would be very grateful if someone can help me with writing a test to
 ensure that this is working correctly.
 Attaching the patch file that adds this functionality.

I am not sure that a test for --method is needed.
Do you want to see Wget generate a DELETE request if you use --method=DELETE?
Just use -d ;-)

But to answer your request: You have to make HTTPServer.pm accept and answer 
DELETE requests (and POST would be nice then, too). Make a copy of a .px test 
and change the options of the wget command.

[just something personal]
I just did that for POST, but I can't see any benefit in adding such a test.
And perl *really* sucks to me (but hey, I am a C and Java programmer).
It took me 45 minutes to understand the perl syntax plus the existing test
environment code... that is far too much for such a simple test environment.
In fact, after that, the actual code change took me 3 minutes.
But I know I'll forget all about perl within hours or at least days... perl
is hack-and-forget and a waste of time (to me, not to everyone).
I would prefer a C test environment for a C project, having tests written
in C.

Regards, Tim



Re: [Bug-wget] wget 1.14 possibly writing off-spec warc.gz files

2013-03-30 Thread Tim Rühsen
On Friday, 29 March 2013, Andy Jackson wrote:
 When using wget 1.14 to generate warc.gz files, e.g.
 
 wget -O tempname --warc-file=output "http://example.com"
 
 the files this creates do not play back well using the Internet Archive's
 warc.gz parsers, throwing errors like
 
 "Invalid FExtra length/records."
 
 It appears wget may be creating slightly malformed GZIP skip-length 
 fields - see 
 
 https://github.com/ukwa/warc-discovery/issues/1 
 
 for details.
 
 It's likely that we'll need to make the warc.gz parsers a bit more 
 robust, but I thought I'd mention it here in case this is 
 actually a bug in wget.
 
 Thanks for your time.
 
 Andy Jackson

Just a very quick test (before I go to bed) shows unexpected behaviour to
me:

$ wget -O tempname --warc-file=output "http://example.com"
results in a 5065-byte file 'output.warc.gz'.

Unzipping it and zipping it again results in a 2387-byte file.

So, at first glimpse, it looks like Wget compresses very suboptimally.
But I won't say it is a bug before I take a deeper look... (in the next few days).

Regards Tim



Re: [Bug-wget] wget ignores --mirror option on openWrt's Luci interface

2013-04-08 Thread Tim Rühsen
On Monday, 8 April 2013, Olivier Diotte wrote:
 Hi Giuseppe,
 
 On Sat, Apr 6, 2013 at 4:22 PM, Giuseppe Scrivano gscriv...@gnu.org wrote:
  Hi Olivier,
 
  Olivier Diotte oliv...@diotte.ca writes:
 
  The commands used are:
  wget --save-cookies cookies.txt --keep-session-cookies --post-data
  'username=root&password=' http://172.16.1.1/cgi-bin/luci/
  wget --load-cookies cookies.txt -m -r
  'http://172.16.1.1/cgi-bin/luci/;stok=b5cb51685f39a8cdc67bf6e1e4871143/admin/status/iptables/'
 
  Here is an example of a tag that isn't followed:
  <a style="display:block;padding-left:25px"
  href="/cgi-bin/luci/;stok=b5cb51685f39a8cdc67bf6e1e4871143/admin/status/routes/">Routes</a>
 
  I couldn't reproduce it by faking a page with that a href you have
  included.
 
  Can you also pass -d to wget and attach the log?
 
  Have you already tried with a newer version of wget too?
 
 
 Attached is the log of a freshly compiled wget-1.14
 (http://ftp.gnu.org/gnu/wget/wget-1.14.tar.xz)
 The problem remains.

Hi Olivier,

in the log I can't find the .../admin/status/routes/ link you mentioned.

That means Wget's parser didn't find it.
To find out whether this is a bug in Wget or in the HTML structure, please
attach the HTML file that contains that href/link.

Regards, Tim



Re: [Bug-wget] wget ignores --mirror option on openWrt's Luci interface

2013-04-11 Thread Tim Rühsen
Hi Olivier,

On Thursday, 11 April 2013, Olivier Diotte wrote:
 On Thu, Apr 11, 2013 at 5:19 AM, Tim Ruehsen tim.rueh...@gmx.de wrote:
  Hi Olivier,
 
  I got openWRT running.
 
  And I can reproduce the problem.
 
  Wget -r seems to miss some a href URLs.
 
  I have a conference right now which might take the whole day...
 
  But maybe someone else could have a quick look at the html file.
 
   It is (attached)
   192.168.1.1/cgi-bin/luci/;stok=5ddf8fa64dbdd8ffb5c878e8d2339567/admin/network/index.html
  
   and it contains e.g.
   <a href="/cgi-bin/luci/;stok=5ddf8fa64dbdd8ffb5c878e8d2339567/admin/network/routes/">Static Routes</a>
 
  but wget doesn't recognize/download that URL.
 
  Regards, Tim
 
 Hi Tim,
 
 Thanks for your interest in my report.

;-) Well it's a bit like an interactive game...

 
 I am not sure whether my original report was correct though: I
 originally thought wget missed href URLs, but I now think my problem
 is with the authentication.
 Attached is the simple script I use to do my tests. Based on those
 tests (and having not had a look at wget's code yet) here is what I
 gather:
 -The --save-cookies invocation creates the cookies.txt file (which, as
 far as I can tell, contains the correct cookie information) and the
 ./index.html file which is a logged-in /cgi-bin/luci/index.html file
 with the href tags and all
 -The --load-cookies invocation doesn't use the ./index.html file (it
 can be deleted prior without changing the behaviour) and it creates a
 hierarchy in the 172.16.1.1 folder (or whatever your router's
 hostname is)
 -All files downloaded by the --load-cookies invocation seem to be the
 login page and the requisite files of that page
 -It seems not to be possible to use any combination of '-l', '-r' or
 '-m' 'with the '--post-data' option (with or without the
 --save-cookies option)
 
 Now, that is the behaviour I get on
 oli@Debianosaur:~/Downloads/wget-1.14/foo/usr/local/bin,0$ ./wget --version
 GNU Wget 1.14 built on linux-gnu.
 
 +digest +https +ipv6 -iri +large-file +nls -ntlm +opie +ssl/gnutls
 
 Wgetrc:
 /usr/local/etc/wgetrc (system)
 Locale: /usr/local/share/locale
 Compile: gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC=/usr/local/etc/wgetrc
 -DLOCALEDIR=/usr/local/share/locale -I. -I../lib -I../lib -O2
 -Wall
 Link: gcc -O2 -Wall -lgnutls -lgcrypt -lgpg-error -lz -lz -lrt ftp-opie.o
 gnutls.o ../lib/libgnu.a
 
 Copyright (C) 2011 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later
 http://www.gnu.org/licenses/gpl.html.
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.
 
 Originally written by Hrvoje Niksic hnik...@xemacs.org.
 Please send bug reports and questions to bug-wget@gnu.org.
 
 
 Do you get the same behaviour on your end? Or do you have a
 combination of options which allow you to get the target/main page of
 a --load-cookies invocation to download correctly?

Sorry, I was in a hurry and attached the wrong index.html.
Now I'm at home without the openWRT stuff.

But I am sure your problem has nothing to do with authentication.
wget -r saved some index.html files, one of them contained an '<a href
...admin/routes/', but it hasn't been parsed by wget... I am pretty sure
about that. I'll take a look tomorrow...

Regards, Tim



Re: [Bug-wget] [Bug-Wget] Use of multipart/form-data when using body-file command

2013-04-14 Thread Tim Rühsen
On Sunday, 14 April 2013, Darshit Shah wrote:
 Assuming that my previous patch adding --method, --body-file and
 --body-data options is accepted and merged into master,
 I wanted to propose that we use Content-Type: multipart/form-data and send
 the whole file as-is when using the --body-file option.
 This allows us to add the long-missing functionality of sending files as
 attachments through wget, without having to change the working of the old
 options.

Why not look at curl (see --form) and decide whether it is the optimum or
whether there is a better way for the user to specify what he wants to
upload? And then implement the best option syntax.

 The only problem I currently see here is that there remains no way for a
 user to send body data in a way that cannot be seen by another user who
 can run ps.

This is a different problem and could be solved by extending the -e option:
a leading special character could indicate that the following characters are
a filename. Then parse that file for 'commands' (aka options).

Regards, Tim



Re: [Bug-wget] Segmentation fault with current development version of wget

2013-04-30 Thread Tim Rühsen
Hi Darshit,

I understand that your patch was just a quick hack.

But even then you should avoid doing
  opt.method == "POST"
for string comparisons.

This is definitely not portable.
Not every compiler/linker aggregates two occurrences of the same static
string into one address in memory.

You should use strcmp or strcasecmp to check if opt.method points to "POST".
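
A small standalone illustration of the pitfall (not wget code):

  /* Illustration of the pitfall (standalone, not wget code). */
  #include <stdio.h>
  #include <string.h>

  int main (void)
  {
    const char *method = "POST";

    /* Wrong: compares addresses; it only "works" if the compiler pools
       identical string literals, which the C standard does not require. */
    if (method == "POST")
      puts ("pointer comparison matched (unportable!)");

    /* Right: compares the characters themselves. */
    if (strcmp (method, "POST") == 0)
      puts ("strcmp matched (portable)");

    return 0;
  }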

Regards, Tim

On Tuesday, 30 April 2013, Darshit Shah wrote:
 Okay, I could not prevent myself from looking into it.
 
 It seems as if the SUSPEND_POST_DATA macro was being called during a
 recursive download attempt.
 
 Attaching a hack around the situation. Will look more deeply when I have
 more time to identify what caused the regression.
 
 On Tue, Apr 30, 2013 at 9:09 PM, Darshit Shah dar...@gmail.com wrote:
 
  I'm in the middle of University exams at the moment. I'll still have a
  look at it tomorrow when I get a breather.
 
  However, it looks like wget is converting any method to a POST request
  which is weird since that should have caused it to fail most of the tests.
 
  I'll have to look into it and check what is causing this issue.
 
  On Tue, Apr 30, 2013 at 8:46 PM, Tim Ruehsen tim.rueh...@gmx.de wrote:
 
  Hi,
 
  you can even reproduce it with a simple
  wget -r http://translationproject.org/latest/make
 
  Darshit, maybe you can have a look at it. It has something to do with
  opt.method (set to the read-only string "POST" in http.c, line 1772).
 
  Even with --method=GET, opt.method points to "POST". And the code tries to
  uppercase it in place.
 
  Regards, Tim
 
  On Tuesday 30 April 2013, Stefano Lattarini wrote:
   Here is the reproducer:
  
     $ wget --passive-ftp -nv --recursive --level=1 --no-directories \
       --no-parent --no-check-certificate -A '*.po' \
       http://translationproject.org/latest/make
     2013-04-30 16:41:25 URL:https://translationproject.org/latest/make/ [5489/5489] -> "make" [1]
     Segmentation fault
  
   Sorry, I don't have time to look into this more deeply ATM, nor to
   provide further feedback.  Hope you can reproduce this yourself;
   otherwise, I'll try to get back to you in the next days.
  
   Regards, and HTH,
     Stefano
 
  Kind regards,
 
   Tim Rühsen
 
 
 
 
  --
  Thanking You,
  Darshit Shah
  Research Lead, Code Innovation
  Kill Code Phobia.
  B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani
 
 
 
 
 -- 
 Thanking You,
 Darshit Shah
 Research Lead, Code Innovation
 Kill Code Phobia.
 B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani
 




Re: [Bug-wget] Segmentation fault with current development version of wget

2013-05-01 Thread Tim Rühsen
On Wednesday, 1 May 2013, Darshit Shah wrote:
 First, sorry for the quick and dirty hack which was the perfect example of
 how NOT to do things.

Then it was a good example ;-)

 Secondly, it is on me that this feature wasn't tested before submitting
 the patch. I had, however, relied on the test environment, and since it
 passed everything there, I thought it was working correctly. Guess we
 should add a test for this soon. --recursive is a commonly used switch
 with Wget, and not having a test to prevent regressions on it is very bad.

There are several tests using -r.
The question is why the problem didn't show up there.

 I am fixing this issue, but it is a terribly ugly hack. If someone could
 help improve it I'd be most truly grateful.
 I have a couple of ideas, but I will need to work them out and implement
 them when I have the time.
 
 The reason it has to be so ugly is that, we cannot use strcmp or strcasecmp
 on a NULL String, and we cannot initialize opt.method since that would
 break some sanity checks which are in place so that --post-* and --body-*
 commands don't conflict with each other.

Your test isn't really ugly.
I (and most C programmers) favor
  opt.method && strcasecmp(opt.method, "POST") == 0
instead of
  opt.method ? strcasecmp(opt.method, "POST") == 0 : false
But that is not really important.

It is pretty common that one or both args to strcmp may be NULL. The really
ugly thing is that there are no string function alternatives that handle
NULL pointers. I regularly use my own versions like strcmp_null() to avoid
extra checks.
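
One possible shape of such a helper (just a sketch; the exact semantics of
my version don't matter here - NULL compares equal to NULL and sorts before
any non-NULL string):

  /* Sketch of a NULL-safe strcmp: NULL == NULL, and NULL sorts before
     any non-NULL string (semantics are illustrative). */
  #include <string.h>

  static int strcmp_null (const char *s1, const char *s2)
  {
    if (s1 == s2)
      return 0;        /* both NULL, or the same pointer */
    if (!s1)
      return -1;
    if (!s2)
      return 1;
    return strcmp (s1, s2);
  }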

Regards, Tim



Re: [Bug-wget] How to tell wget to strip part following question mark in local filenames?

2013-05-08 Thread Tim Rühsen
Hi,

On Wednesday, 8 May 2013, Mark wrote:
 Hi,
 
 I noticed some problems relating to URLs like
   
http://www.example.com/path/to/filename.zip?arg1=somestring&arg2=anotherstring&...
 
 Wget doesn't strip the ? and following characters from the filename when
 creating local files. As far as I can tell it doesn't have an option to do
 that. This can cause several problems:
 
  - Local filenames have garbage following the actual extension which the
 user has to manually remove.

In many (most?) cases this is not garbage.
It is common that different argument values return different content.
To change the output file name for single downloads, use -O /
--output-document.

  - Depending on the web server, each download session may result in unique
 arguments in the URL (e.g. some kind of session ID), making it impossible
 to easily resume downloading partially-downloaded files. Wget would
 instead re-download the whole file, saving it under a different name.

When you resume a download, you are not in --recursive mode.
Again, -O should do it.

  - The worst problem is that when the arguments following the actual
 filename in the URL are very long, wget is unable to create the file at
 all, reporting
   File name too long

Again, this is only a problem in recursive mode.
Here, a hash string (e.g. SHA1 or MD5) instead of the query part (and/or
the filename part) could be helpful.
If needed, Wget could create a flat text file that maps hash codes to real
filenames / URLs in these cases.
Anyone with other ideas?

 So this message is to suggest adding an option to tell wget to strip the
 question mark and everything after it from the filename part of URLs to
 get the local file name.

Thanks for your suggestion.

Regards, Tim



[Bug-wget] [PATCH] Regression since wget 1.10: no_prefix function is *bad*

2013-05-11 Thread Tim Rühsen
Hi Marlin,

having an abort() without a message is simply a big waste of time for any
developer who stumbles upon it.

Since the init code of Wget has to be rewritten anyway, I provide the
fastest solution right now: increasing the buffer size and printing a
message before Wget aborts.

And yes, the whole issue is hell stupid...

Regards, Tim


On Friday, 10 May 2013, Marlin Frickenschmidt wrote:
 Hello dear wget maintainers,
 I want to report a bug of sorts. It is not a direct bug that impedes the
 operation of wget for normal users, but one which basically makes it
 impossible to add more command-line options to wget without sooner or
 later making wget suddenly SIGABRT - without any inclination to show a
 proper error whatsoever.
 
 The reason for this is the no_prefix function, which is supposed to
 prepend the string "no-" to a given string (for disabling certain
 command line options). The function makes the assumption that the total
 length of all command line option names together won't exceed 1024
 characters, because the buffer storing the strings for all the
 "no-"-prefixed command line options is only chosen that large. And so if
 it gets too big, there is just a silent abort(), no error message, no
 nothing.
 
 Why don't we use standard functions like asprintf to prepend strings,
 and instead build our own, completely broken function for it? Perhaps
 there are good reasons for building the string yourself instead of using
 the standard library; I don't know. In any case, there is no excuse for
 silently abort()ing without an error message. That is the only thing I
 am grumpy about, really...
 
 I hope this can be fixed in the next release for the sake of all wget
 developers around. I just spent two hours debugging this, and really
 couldn't believe my eyes when I found the reason for it. It is hell stupid.
 
 Cheers,
 Marlin
 
 

From e21315b9b4d41987479427ed203ae402695ec4df Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Sat, 11 May 2013 17:11:51 +0200
Subject: [PATCH 3/3] warn before abort() in main.c/no_prefix

---
 src/ChangeLog |4 
 src/main.c|7 +--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 3b6733e..bcff0f7 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,7 @@
+2013-05-11  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* main.c (init_switches): warn before abort() in no_prefix()
+
 2013-05-09  Tim Ruehsen  tim.rueh...@gmx.de
 
 	* cookies.c, ftp-ls.c, ftp.c, init.c, netrc.c, utils.c, utils.h:
diff --git a/src/main.c b/src/main.c
index 2b42d2d..70ce651 100644
--- a/src/main.c
+++ b/src/main.c
@@ -320,13 +320,15 @@ static struct cmdline_option option_data[] =
 static char *
 no_prefix (const char *s)
 {
-  static char buffer[1024];
+  static char buffer[2048];
   static char *p = buffer;
 
   char *cp = p;
   int size = 3 + strlen (s) + 1;  /* "no-STRING\0" */
-  if (p + size >= buffer + sizeof (buffer))
+  if (p + size >= buffer + sizeof (buffer)) {
+fprintf (stderr, _("Internal error: size of 'buffer' in main.c/no_prefix() must be increased\n"));
 abort ();
+  }
 
   cp[0] = 'n', cp[1] = 'o', cp[2] = '-';
   strcpy (cp + 3, s);
@@ -352,6 +354,7 @@ init_switches (void)
 {
   char *p = short_options;
   size_t i, o = 0;
+
   for (i = 0; i < countof (option_data); i++)
 {
   struct cmdline_option *opt = option_data[i];
-- 
1.7.10.4



[Bug-wget] [PATCH] bit cleanup in utils.c

2013-05-11 Thread Tim Rühsen
I replaced some hand-written string code by standard library functions.
In any case these functions may be found in gnulib as well.

Regards, Tim
From d540fd5dbd3644936a8ad1a384516abba10de268 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Thu, 9 May 2013 19:53:36 +0200
Subject: [PATCH 1/3] src/utils.c cleanup

---
 src/ChangeLog |6 ++
 src/utils.c   |   66 -
 2 files changed, 29 insertions(+), 43 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index f4fa342..84a9645 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,9 @@
+2013-05-09  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* utils.c: use standard string functions instead of self-written
+	code in acceptable(), match_tail(), suffix(), has_wildcards_p().
+	Avoid some warnings in test code.
+
 2013-05-05  mancha  manc...@hush.com (tiny change)
 
 	* gnutls.c (ssl_connect_wget): Don't abort on non-fatal alerts
diff --git a/src/utils.c b/src/utils.c
index faae62e..f7baed6 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -900,15 +900,14 @@ static bool in_acclist (const char *const *, const char *, bool);
 bool
 acceptable (const char *s)
 {
-  int l = strlen (s);
+  const char *p;
 
   if (opt.output_document && strcmp (s, opt.output_document) == 0)
 return true;
 
-  while (l && s[l] != '/')
-    --l;
-  if (s[l] == '/')
-    s += (l + 1);
+  if ((p = strrchr(s, '/')))
+    s = p + 1;
+
   if (opt.accepts)
 {
   if (opt.rejects)
@@ -919,6 +918,7 @@ acceptable (const char *s)
 }
   else if (opt.rejects)
 return !in_acclist ((const char *const *)opt.rejects, s, true);
+
   return true;
 }
 
@@ -1018,29 +1018,15 @@ accdir (const char *directory)
 bool
 match_tail (const char *string, const char *tail, bool fold_case)
 {
-  int i, j;
+  int pos = strlen (string) - strlen(tail);
 
-  /* We want this to be fast, so we code two loops, one with
- case-folding, one without. */
+  if (pos  0)
+	  return false; /* tail is longer than string */
 
   if (!fold_case)
-{
-  for (i = strlen (string), j = strlen (tail); i >= 0 && j >= 0; i--, j--)
-    if (string[i] != tail[j])
-      break;
-}
-  else
-{
-  for (i = strlen (string), j = strlen (tail); i >= 0 && j >= 0; i--, j--)
-    if (c_tolower (string[i]) != c_tolower (tail[j]))
-      break;
-}
-
-  /* If the tail was exhausted, the match was succesful.  */
-  if (j == -1)
-    return true;
+    return strcmp (string + pos, tail) == 0;
   else
-    return false;
+    return strcasecmp (string + pos, tail) == 0;
 }
 
 /* Checks whether string S matches each element of ACCEPTS.  A list
@@ -1089,15 +1075,12 @@ in_acclist (const char *const *accepts, const char *s, bool backward)
 char *
 suffix (const char *str)
 {
-  int i;
+  char *p;
 
-  for (i = strlen (str); i && str[i] != '/' && str[i] != '.'; i--)
-    ;
+  if ((p = strrchr(str, '.')) && !strchr(p + 1, '/'))
+    return p + 1;
 
-  if (str[i++] == '.')
-    return (char *)str + i;
-  else
-    return NULL;
+  return NULL;
 }
 
 /* Return true if S contains globbing wildcards (`*', `?', `[' or
@@ -1106,10 +1089,7 @@ suffix (const char *str)
 bool
 has_wildcards_p (const char *s)
 {
-  for (; *s; s++)
-    if (*s == '*' || *s == '?' || *s == '[' || *s == ']')
-      return true;
-  return false;
+  return !!strpbrk(s, "*?[]");
 }
 
 /* Return true if FNAME ends with a typical HTML suffix.  The
@@ -2553,16 +2533,16 @@ get_max_length (const char *path, int length, int name)
 const char *
 test_subdir_p()
 {
-  int i;
-  struct {
-char *d1;
-char *d2;
+  static struct {
+const char *d1;
+const char *d2;
 bool result;
   } test_array[] = {
 { "/somedir", "/somedir", true },
 { "/somedir", "/somedir/d2", true },
 { "/somedir/d1", "/somedir", false },
   };
+  unsigned i;
 
   for (i = 0; i  countof(test_array); ++i)
 {
@@ -2578,10 +2558,9 @@ test_subdir_p()
 const char *
 test_dir_matches_p()
 {
-  int i;
-  struct {
-char *dirlist[3];
-char *dir;
+  static struct {
+const char *dirlist[3];
+const char *dir;
 bool result;
   } test_array[] = {
 { { "/somedir", "/someotherdir", NULL }, "somedir", true },
 
 { { "/Tmp/has", NULL, NULL }, "/Tmp/has space", false },
 { { "/Tmp/has", NULL, NULL }, "/Tmp/has,comma", false },
   };
+  unsigned i;
 
   for (i = 0; i  countof(test_array); ++i)
 {
-- 
1.7.10.4



[Bug-wget] [PATCH] replaced read_whole_file() by getline()

2013-05-11 Thread Tim Rühsen
Replaced read_whole_line(), which needs one malloc/free per line, by
getline(), which reuses a growable buffer.

getline() is a GNU function (but Wget is a GNU tool, isn't it? :-).
Since Wget compiles/links with gnulib, I don't see a problem here.
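
For reference, the basic getline() pattern the patch uses (a standalone
sketch; '/etc/hosts' is just an example input):

  /* The basic getline() loop used in the patch (standalone sketch). */
  #define _GNU_SOURCE
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/types.h>

  int main (void)
  {
    char *line = NULL;      /* getline allocates and grows this buffer */
    size_t bufsize = 0;
    ssize_t len;
    FILE *fp = fopen ("/etc/hosts", "r");

    if (!fp)
      return 1;
    while ((len = getline (&line, &bufsize, fp)) > 0)
      printf ("%zd bytes: %s", len, line);
    free (line);            /* one free for the whole loop, not one per line */
    fclose (fp);
    return 0;
  }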

Regards, Tim

From eeffb62ba0a13c5c20ddd66492c7774e0f713237 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Thu, 9 May 2013 22:37:17 +0200
Subject: [PATCH 2/3] replaced read_whole_file() by getline()

---
 src/ChangeLog |5 ++
 src/cookies.c |9 +++-
 src/ftp-ls.c  |  148 +++--
 src/ftp.c |   16 ---
 src/init.c|7 +--
 src/netrc.c   |   50 ++-
 src/utils.c   |   50 ---
 src/utils.h   |1 -
 8 files changed, 89 insertions(+), 197 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 84a9645..3b6733e 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,5 +1,10 @@
 2013-05-09  Tim Ruehsen  tim.rueh...@gmx.de
 
+	* cookies.c, ftp-ls.c, ftp.c, init.c, netrc.c, utils.c, utils.h:
+	replaced read_whole_file() by getline().
+
+2013-05-09  Tim Ruehsen  tim.rueh...@gmx.de
+
 	* utils.c: use standard string functions instead of self-written
 	code in acceptable(), match_tail(), suffix(), has_wildcards_p().
 	Avoid some warnings in test code.
diff --git a/src/cookies.c b/src/cookies.c
index 87cc554..4efda88 100644
--- a/src/cookies.c
+++ b/src/cookies.c
@@ -1129,7 +1129,9 @@ domain_port (const char *domain_b, const char *domain_e,
 void
 cookie_jar_load (struct cookie_jar *jar, const char *file)
 {
-  char *line;
+  char *line = NULL;
+  size_t bufsize = 0;
+
   FILE *fp = fopen (file, "r");
   if (!fp)
 {
@@ -1137,9 +1139,10 @@ cookie_jar_load (struct cookie_jar *jar, const char *file)
  quote (file), strerror (errno));
   return;
 }
+
   cookies_now = time (NULL);
 
-  for (; ((line = read_whole_line (fp)) != NULL); xfree (line))
+  while (getline (&line, &bufsize, fp) > 0)
 {
   struct cookie *cookie;
   char *p = line;
@@ -1233,6 +1236,8 @@ cookie_jar_load (struct cookie_jar *jar, const char *file)
 abort_cookie:
   delete_cookie (cookie);
 }
+
+  xfree(line);
   fclose (fp);
 }
 
diff --git a/src/ftp-ls.c b/src/ftp-ls.c
index 3056651..401ae77 100644
--- a/src/ftp-ls.c
+++ b/src/ftp-ls.c
@@ -68,16 +68,17 @@ symperms (const char *s)
replaces all TAB character with SPACE. Returns the length of the
modified line. */
 static int
-clean_line(char *line)
+clean_line(char *line, int len)
 {
-  int len = strlen (line);
-  if (!len) return 0;
-  if (line[len - 1] == '\n')
+  if (len <= 0) return 0;
+
+  while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
 line[--len] = '\0';
+
   if (!len) return 0;
-  if (line[len - 1] == '\r')
-line[--len] = '\0';
+
   for ( ; *line ; line++ ) if (*line == '\t') *line = ' ';
+
   return len;
 }
 
@@ -102,8 +103,9 @@ ftp_parse_unix_ls (const char *file, int ignore_perms)
   int hour, min, sec, ptype;
   struct tm timestruct, *tnow;
   time_t timenow;
+  size_t bufsize = 0;
 
-  char *line, *tok, *ptok;  /* tokenizer */
+  char *line = NULL, *tok, *ptok;  /* tokenizer */
   struct fileinfo *dir, *l, cur; /* list creation */
 
   fp = fopen (file, "rb");
@@ -115,22 +117,16 @@ ftp_parse_unix_ls (const char *file, int ignore_perms)
   dir = l = NULL;
 
   /* Line loop to end of file: */
-  while ((line = read_whole_line (fp)) != NULL)
+  while ((len = getline (&line, &bufsize, fp)) > 0)
 {
-  len = clean_line (line);
+  len = clean_line (line, len);
   /* Skip if total...  */
   if (!strncasecmp (line, "total", 5))
-{
-  xfree (line);
   continue;
-}
   /* Get the first token (permissions).  */
   tok = strtok (line, " ");
   if (!tok)
-{
-  xfree (line);
   continue;
-}
 
   cur.name = NULL;
   cur.linkto = NULL;
@@ -368,7 +364,6 @@ ftp_parse_unix_ls (const char *file, int ignore_perms)
   DEBUGP ((Skipping.\n));
   xfree_null (cur.name);
   xfree_null (cur.linkto);
-  xfree (line);
   continue;
 }
 
@@ -416,10 +411,9 @@ ftp_parse_unix_ls (const char *file, int ignore_perms)
   timestruct.tm_isdst = -1;
   l->tstamp = mktime (&timestruct); /* store the time-stamp */
   l->ptype = ptype;
-
-  xfree (line);
 }
 
+  xfree (line);
   fclose (fp);
   return dir;
 }
@@ -431,9 +425,10 @@ ftp_parse_winnt_ls (const char *file)
   int len;
   int year, month, day; /* for time analysis */
   int hour, min;
+  size_t bufsize = 0;
   struct tm timestruct;
 
-  char *line, *tok; /* tokenizer */
+  char *line = NULL, *tok; /* tokenizer */
   char *filename;
   struct fileinfo *dir, *l, cur; /* list creation */
 
@@ -446,29 +441,29 @@ ftp_parse_winnt_ls (const char *file)
   dir = l = NULL;
 
   /* Line loop to end of file: */
-  while ((line = 

Re: [Bug-wget] [PATCH] Regression since wget 1.10: no_prefix function is *bad*

2013-05-12 Thread Tim Rühsen
On Sunday, 12 May 2013, Giuseppe Scrivano wrote:
 Tim Rühsen tim.rueh...@gmx.de writes:
 
  having an abort() without a message is simply a big waste of time for any 
  developer who stumbles upon it.
 
 I disagree here, what is so difficult that a debugger cannot catch?  On
 the other hand, I agree this can be improved.
 
 
 
  Since the init code of Wget has to be rewritten anyway, I provide the
  fastest solution right now: increasing the buffer size and printing a
  message before Wget aborts.
 
  And yes, the whole issue is hell stupid...
 
  -  static char buffer[1024];
  +  static char buffer[2048];
 
 
 This won't really fix the problem of having a static buffer, the real
 fix would be to dynamically allocate the memory.

Yes, as I wrote, it is a quick hack.

A real solution would be a rewrite of the init stuff (I saw that already
somewhere on the Wget 2.0 wish list - don't remember exactly).
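
For the record, one possible shape of such a dynamically allocating
no_prefix() (just a sketch, not the planned rewrite):

  /* A dynamically allocating no_prefix() (sketch; error handling
     kept minimal on purpose). */
  #include <stdlib.h>
  #include <string.h>

  static char *
  no_prefix (const char *s)
  {
    size_t len = strlen (s);
    char *p = malloc (3 + len + 1);   /* "no-" + s + '\0' */

    if (!p)
      abort ();
    memcpy (p, "no-", 3);
    memcpy (p + 3, s, len + 1);       /* copies the trailing '\0' too */
    return p;                         /* the caller owns the string now */
  }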

I already wrote this kind of code and would contribute it to Wget.
But I am unsure how to apply it to Wget. Since it would be a pretty big
change, should I git-clone Wget and you merge later, or do you create a new
branch, or ...

Ah, then we again have to discuss that infamous C89/C99 thing.
AFAIR, the main argument against C99 came from Daniel Stenberg (curl,
haxx.se), who mentioned MS Visual C not being C99-ready (it will never be,
said MS).
I just saw that Debian has MinGW cross-compiler packages for Win32 and
Win64 with gcc 4.6, but I have no experience with those.
Does anybody know if that is a real alternative to MS VC?

Regards, Tim



Re: [Bug-wget] [PATCH] Regression since wget 1.10: no_prefix function is *bad*

2013-05-13 Thread Tim Rühsen
On Sunday, 12 May 2013, Ángel González wrote:
 On 12/05/13 21:50, Tim Rühsen wrote:
  A real solution would be a rewrite of the init stuff (I saw that already
  somewhere on the Wget 2.0 wish list - don't remember exactly).
 
  I already wrote this kind of code and would contribute it to Wget.
  But I am unsure how to apply it to Wget. Since it would be a pretty big
  change, should I git-clone Wget and you merge later, or do you create a
  new branch, or ...
 
  Ah, then we again have to discuss that infamous C89/C99 thing.
  AFAIR, the main argument against C99 came from Daniel Stenberg (curl,
  haxx.se), who mentioned MS Visual C not being C99-ready (it will never
  be, said MS).
  I just saw that Debian has MinGW cross-compiler packages for Win32 and
  Win64 with gcc 4.6, but I have no experience with those.
  Does anybody know if that is a real alternative to MS VC?
 
  Regards, Tim
 Yes, it is a real alternative, as a compiler which works :)
 However, I'm not sure how well wget compiles natively on win32 right now,
 either with VC++ or gcc, mostly due to autoconf and gnulib detection.


Thanks for the hint. I just installed Debian's MinGW packages and it worked
like a charm (except for a minor compile issue in url.c).

For anyone who wants to try:
$ make distclean
$ ./configure --host=i686-w64-mingw32 --without-ssl --disable-ipv6
$ make
$ wine src/wget.exe http://www.example.com

I have no real Windows installation around to test wget.exe.
But assuming it works: is there any need to stick to C89 code?
Or in other words: do we still have to support native Windows compilation
with MSVC (couldn't these people install and use MinGW)?

Regards, Tim



Re: [Bug-wget] [PATCH] Regression since wget 1.10: no_prefix function is *bad*

2013-05-14 Thread Tim Rühsen
Hi Alex,

yes, it is the _PC_NAME_MAX issue which is only valid for pathconf().

Attached is the little patch to fix it.

Since MinGW is based on gcc 4.6, C99 should be available.
As the Wiki states (well, the entry is 3 years old...), printf() is the
main issue. But there may be some other functions that don't behave C99
compliant. That sounds like a library issue, not a compiler issue.

Maybe some functions have to be provided by Wget. If we just had a list of
issues/functions...

Regards, Tim

On Monday, 13 May 2013, Bykov Aleksey wrote:
 Greetings, Tim Rühsen.
 Possibly I have understood it wrong, but according to
 http://www.mingw.org/wiki/C99 MinGW doesn't support C99 at all. So I'm not
 sure about cross-compiling.
 Maybe after the move to C99 code it can only be compiled through
 CygWin (I don't remember exactly, but two years ago Wget compiled on
 CygWin needed only several libraries as dependencies on pure Windows.
 If needed, I'll try to check this during the week).
 
 P.S. Currently Wget compiles without any problem in Windows MinGW, and the
 resulting binary works on pure Windows without any dependencies.
 
 P.P.S. What minor compile issue in url.c? I only get an error with
 undefined PC_NAME_MAX (just add it to the compiler flags)?
 
 P.P.P.S. Sorry for bad English.
 
 Best regards, Alex.
 
  Thanks for the hint. I just installed Debian's MinGW packages and it
  worked like a charm (except for a minor compile issue in url.c).
 
  For anyone who wants to try:
  $ make distclean
  $ ./configure --host=i686-w64-mingw32 --without-ssl --disable-ipv6
  $ make
  $ wine src/wget.exe http://www.example.com
 
  I have no real Windows installation around to test wget.exe.
  But assuming it works: is there any need to stick to C89 code?
  Or in other words: do we still have to support native Windows
  compilation with MSVC (couldn't these people install and use MinGW)?
 
  Regards, Tim
 


Re: [Bug-wget] [PATCH] replaced read_whole_file() by getline()

2013-05-14 Thread Tim Rühsen
Sorry, forgot to switch my IDE to GNU style.

But now that I made all the requested changes to my working tree, how do I
make a diff to some commit back in time or to upstream? Especially with git
format-patch? Locally, I didn't create my own branch, so I am on master.
(I have to read a git book someday...)

Regards, Tim



Re: [Bug-wget] [PATCH] replaced read_whole_file() by getline()

2013-05-14 Thread Tim Rühsen
Thank you and Angel for your answers.

On Tuesday, 14 May 2013, Daniel Stenberg wrote:
 On Tue, 14 May 2013, Tim Rühsen wrote:
 
  But now that I made all the requested changes to my working tree, how do
  I make a diff to some commit back in time or to upstream? Especially with
  git format-patch? Locally, I didn't create my own branch, so I am on
  master. (I have to read a git book someday...)
 
 'git commit --amend [files]'
 
 ... to update your (most recent) commit. Then you can git format-patch
 again and resend.
 
 But yes, working on stuff like this in your own local branch is often a
 better idea if you ask me...

Maybe it is a good idea, since (at least I hope so) with your own branch you
can still 'git pull' without conflicts while having your own commits in the
other branch. I'll have to try that out.

I didn't know how to amend one of the earlier commits.

Than I encountered 'git branch -a':
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
  remotes/origin/parallel-wget

This let me try 'git diff remotes/origin/master master'.
I piped the output into a file and hand-removed the irrelevant changes.

Not very elegant, but it should work.

Regards, Tim



Re: [Bug-wget] [PATCH] Regression since wget 1.10: no_prefix function is *bad*

2013-05-15 Thread Tim Rühsen
Hi Alex,

snprintf %a seems to print the correct result with wine (set to WinXP), but 
the same executable on a real WinXP just prints 'a'.

Replacing the sprintf() by __mingw_sprintf printed the correct result with 
wine and on the WinXP machine.

 About C99 - sorry, I think if the article isn't changed/removed then it's
 still current. Yes, in the MinGW mailing lists I found mention of
 -std=c99 and changes to mingw_snprintf. Sorry for the mistake.
 Maybe it's time for a new test branch - wget-C99? And then find which
 fragments don't work / cannot be hooked in MinGW/CygWin/MSVC and need to
 be replaced? Something like "from practice to theory".

That is a good idea.
Should Giuseppe create a test branch upstream, or how should we go on?


  how do I make a diff to some commit back in time or to upstream?
  From current to upstream
 git format-patch origin/master
 
  From current to some commit
 git format-patch ccd369d
 
 Single commit
 git format-patch ccd369d^!
 
 Between two commits
 git format-patch 027d9f...ae80fd2

Thanks.

Tim



Re: [Bug-wget] [PATCH] Regression since wget 1.10: no_prefix function is *bad*

2013-05-16 Thread Tim Rühsen
On Thursday, 16 May 2013, Giuseppe Scrivano wrote:
 Tim Rühsen tim.rueh...@gmx.de writes:
 
  Hi Alex,
 
  yes, it is the _PC_NAME_MAX issue which is only valid for pathconf().
 
  Attached is the little patch to fix it.
 
  Since MinGW is based on gcc 4.6, C99 should be available.
  As the Wiki states (well, the entry is 3 years old...), printf() is the
  main issue. But there may be some other functions that don't behave C99
  compliant. That sounds like a library issue, not a compiler issue.
 
  Maybe some functions have to be provided by Wget. If we just had a list
  of issues/functions...
 
 most of these portability problems should be fixed by gnulib.  Have you
 checked those?  I am pretty sure that gnulib provides a POSIX compliant
 printf replacement on platforms that lack it.

Now I checked the gnulib sources. And as far as I can see, the printf-like
functions are C99 compliant.

So we have a C99 (cross-)compiler plus a C99 function library (for Win32,
Win64 and MSDOS). Are there any OSes that don't have a C99 compiler and
that Wget needs/wants to support?

Background:
Like many C programmers, I am used to C99, and for me it is a PITA to check
my patches to be 100% C89 compliant before sending them. It is not fun.
So I wish to see C99 accepted in Wget in the future...

Regards, Tim



Re: [Bug-wget] [bug-wget] pod problems when compiling, with perl 5.18

2013-05-26 Thread Tim Rühsen
On Sunday, 26 May 2013, Javier Vasquez wrote:
 On Sun, May 26, 2013 at 9:51 AM, Tim Rühsen tim.rueh...@gmx.de wrote:
  ...
 
   You can edit wget.texi and change all e.g. '@item number' into '@item
   string'. I can't test it right here since perl 5.18 is still in
   experimental and has some dependency problems right now.
  
   So, this is just an example patch that should correct the issues with
   perl numbers. Not sure how e.g. @item '1' would appear in the man page,
   so I take @item 1..
 
 ...
 
  Regards, Tim
 
 That was really helpful Tim, it made me look into the texinfo
 documentation...
 
 The dot, ".", after the number didn't help, I had to place it before
 instead, :-).
 
 Also, if one has @item in the line immediately following another
 @item, that's a failure, since texinfo requires @itemx instead.
 
 And even if one has @item followed by @itemx, but with other lines in
 between the two, somehow that's not understood, and @item is required...
 
 Weird thing...
 
 I'm attaching the patch that worked for me, and I'm pasting how the
 man looks like for the particular exit status after the fix:
 
 
 EXIT STATUS
Wget may return one of several error codes if it encounters problems.
 
.0  No problems occurred.
 
.1  Generic error code.
 
.2  Parse error - for instance, when parsing command-line
 options, the .wgetrc or .netrc...
 
.3  File I/O error.
 
.4  Network failure.
 
.5  SSL verification failure.
 
.6  Username/password authentication failure.
 
.7  Protocol errors.
 
.8  Server issued an error response.
 
 
 Well, it's able to build now, :-)
 
 --
 Javier.
 

Thanks, that's good to hear.

Could you give a try for  "1"  or  '1'  instead of  ".1" ?
I wonder if that compiles and how it looks in the man page...
IF it works and looks good, maybe you could attach a new patch !?

Regards, Tim



Re: [Bug-wget] [bug-wget] pod problems when compiling, with perl 5.18

2013-05-26 Thread Tim Rühsen
Am Sonntag, 26. Mai 2013 schrieb Tim Rühsen:
 [... full quote of the previous message trimmed ...]
 
 Could you give a try for  "1"  or  '1'  instead of  ".1" ?
 I wonder if that compiles and how it looks in the man page...
 IF it works and looks good, maybe you could attach a new patch !?

Sorry, don't bother. I think it won't look as expected in the man page...

Regards, Tim



Re: [Bug-wget] GSoC 2013 project on rewriting Test Suite in Python.

2013-05-30 Thread Tim Rühsen
Hi Darshit,

congratulations on your selection !

I didn't know about your proposal, so I couldn't post my opinion...

In your proposal you write:
 The suggestion as one dev put it, “I would prefer a C test environment for a 
 C project, having tests written in C”.

I guess that was me ;-)

 This is, however, not an optimal solution. A major issue with writing the
 test environment in C is the execution of external binaries. This would
 create a lot of code clutter which can be easily avoided through the use
 of a scripting language.

I can't see this. The only external program that has to be executed (for
every test) is Wget. Tests can always be merged into one executable, if that
is a concern at all.

I know it is too late now (a typical non-communication fault), but I have to
say it: I already started rewriting the test suite in C. (It is more a
spin-off of writing a libwget - with that, it was easy to implement an
HTTP/HTTPS server.)

I still appreciate your work - Python is way closer to C than Perl is.
So, if you have questions regarding the test suite or need some help, don't
hesitate to ask me. I guess a Python test suite will look very similar.

Just a snippet from a C(99) Test program (I still use mget/MGET prefixes, think 
of MGET as WGET):

#include <libtest.h>

static const char *mainpage = "\
<html>\n\
<head>\n\
  <title>Main Page</title>\n\
</head>\n\
<body>\n\
  <p>\n\
    Some text and a link to a <a href=\"http://localhost:{{port}}/secondpage.html\">second page</a>.\n\
    Also, a <a href=\"http://localhost:{{port}}/nonexistent\">broken link</a>.\n\
  </p>\n\
</body>\n\
</html>\n";


int main(void)
{
    mget_test_url_t urls[]={
        {   .name = "/index.html",
            .code = "200 Dontcare",
            .body = mainpage,
            .headers = {
                "Content-Type: text/html",
            }
        },
    };

    // start the server thread in the background
    mget_test_start_http_server(
        MGET_TEST_RESPONSE_URLS, urls, countof(urls),
        0);

    // 1. test --spider
    mget_test(
        MGET_TEST_OPTIONS, "--spider",
        MGET_TEST_REQUEST_URL, "index.html",
        MGET_TEST_EXPECTED_ERROR_CODE, 0,
        0);

    // 2. test --spider-fail
    mget_test(
        MGET_TEST_OPTIONS, "--spider",
        MGET_TEST_REQUEST_URL, "nonexistent",
        MGET_TEST_EXPECTED_ERROR_CODE, 8,
        0);

    // ... ~30 test cases implemented so far
}

Regards, Tim


Am Donnerstag, 30. Mai 2013 schrieb Darshit Shah:
 Hello to all!
 
 As many of you may have noticed, I have been contributing to Wget over the
 last couple of months. One of the major contributions has been support for
 RESTful scripting. It is still not refined and a couple of bugs need to be
 solved. That will be done before the window closes for the next release.
 
 However, I am also a student and have applied to and been selected for the
 Google Summer of Code, 2013. My proposal on which I am expected to work
 over the next 2 months is titled: Move Test bench Suite from Perl to
 Python.
 The complete proposal is public and can be viewed at:
 https://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/darnir/1
 
 Since this proposal affects the developers of Wget rather than the users,
 and this mailing list reaches all the major contributors to GNU Wget, I
 thought I should discuss the details on this list.
 
 I have currently proposed a structure similar to the current one in Perl:
 1. HTTPServer
 2. HTTPTest
 3. FTPServer
 4. FTPTest
 5. WgetTest
 6. runTests
 
 The individual test files will define the input URLs and files, the
 expected returned pages, files to exist on Server and expected return code
 from Wget.
 A runTests module will accept extra Command Line parameters for Wget and
 will be used as the single point through which tests must be carried out.
 The WgetTest module will accept the parameters from the test files which
 may be overridden through parameters set through the runTests module. This
 module will also be tasked with creating the various output files that are
 required to be stored on the server. It will also fire up a HTTP / FTP
 Server on a separate thread and execute the required Wget command and
 collect its return code and output files.
 The HTTPServer / FTPServer modules are to be tasked with simply creating
 the respective servers with the required featureset (SSL, NTLM,
 cookie-auth, etc.)
 
 The main aim of this shift is to create a Test Environment that is more
 robust and easier to extend in terms of new tests.
 
 While this is what I proposed, I kindly request everyone to pitch in with
 their suggestions on what they would like to see in the new test suite.
 Features that are currently missing or nuances in the current test
 environment. What should be there and what shouldn't?
 
 -- 
 Thanking You,
 

Re: [Bug-wget] Limit number of links retrieved with --mirror

2013-05-31 Thread Tim Rühsen
Am Dienstag, 28. Mai 2013 schrieb David Linn:
 Is there a way I can limit the number of links retrieved via wget -m ?
 For example, just the first 100 links in a website.

Yes, having a *nix shell and grep/egrep around, you can:

wget -m www.your-domain.org 2>&1 | egrep -m 100 'saved|no newer than'

If you also want some output on the screen, see 'man tee'.
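For example (untested, off the top of my head):

wget -m www.your-domain.org 2>&1 | tee wget.log | egrep -m 100 'saved|no newer than'

tee writes the complete log to wget.log while the filtered lines still appear
on the screen.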

Regards, Tim



Re: [Bug-wget] [PATCH] MinGW compatibility fixes

2013-06-16 Thread Tim Rühsen
Am Sonntag, 16. Juni 2013 schrieb Giuseppe Scrivano:
 Giuseppe Scrivano gscriv...@gnu.org writes:
 
  In my understanding, all new general patches should go into 'master',
  those regarding metalink/multithreading should go into (experimental)
  'parallel-wget' for later merging with 'master'.
 
  obviously it has to be into master too, thanks to have reported it.  I
  am going to cherry-pick from parallel-wget into master.
 
 I have checked it, but I don't think that master needs this fix (the
 patch includes also an improvement, but nothing blocking to be on master
 ASAP).  Can you confirm this?

Some people stumbled upon the _PC_NAME_MAX stuff when compiling for Windows.
Just search the last 2 months for '_PC_NAME_MAX'.

I wondered why Bykov Aleksey recommends 'Replace _PC_NAME_MAX with 256 in
url.c:1620. Then run ...' (12th June), since Ray's patch should handle that.
I'm sure I also sent (at least one) patch regarding this issue, but I think
it/they got lost or something.

However, it boils up again and again, but it is not a blocker that needs to
be fixed ASAP. It's just annoying.

Regards, Tim



Re: [Bug-wget] [PATCH] timeout option is ingnored if host does not answer SSL handshake (openssl)

2013-07-11 Thread Tim Rühsen
Am Donnerstag, 11. Juli 2013 schrieb Tomas Hozza:
 Calling wget on https server with --timeout option does not work
 when the server does not answer SSL handshake. Note that this has
 been tested on wget-1.14 compiled with OpenSSL.

Hi,

here is the corresponding patch for GnuTLS.

Regards, Tim
From 5862c2e0e84838f40eda6332650bab10274bb211 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Thu, 11 Jul 2013 14:29:20 +0200
Subject: [PATCH] add connect timeout to gnutls code

---
 src/ChangeLog |  6 ++
 src/gnutls.c  | 63 +--
 2 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 5b978eb..c39cfcb 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,9 @@
+2013-07-11  Tim Ruehsen  tim.rueh...@gmx.de
+
+* gnutls.c (ssl_connect_wget): respect connect timeout
+
 2013-04-26  Tomas Hozza  tho...@redhat.com (tiny change)
 
 	* log.c (redirect_output): Use DEFAULT_LOGFILE in diagnostic message
diff --git a/src/gnutls.c b/src/gnutls.c
index 54422fc..a3b4ecc 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -374,6 +374,9 @@ static struct transport_implementation wgnutls_transport =
 bool
 ssl_connect_wget (int fd, const char *hostname)
 {
+#ifdef F_GETFL
+  int flags = 0;
+#endif
   struct wgnutls_transport_context *ctx;
   gnutls_session_t session;
   int err,alert;
@@ -441,11 +444,55 @@ ssl_connect_wget (int fd, const char *hostname)
   return false;
 }
 
+  if (opt.connect_timeout)
+{
+#ifdef F_GETFL
+      flags = fcntl (fd, F_GETFL, 0);
+      if (flags < 0)
+        return flags;
+      if (fcntl (fd, F_SETFL, flags | O_NONBLOCK))
+        return -1;
+#else
+      /* XXX: Assume it was blocking before.  */
+      const int one = 1;
+      if (ioctl (fd, FIONBIO, &one) < 0)
+        return -1;
+#endif
+}
+
   /* We don't stop the handshake process for non-fatal errors */
   do
 {
   err = gnutls_handshake (session);
-      if (err < 0)
+
+      if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
+        {
+          if (gnutls_record_get_direction (session))
+            {
+              /* wait for writeability */
+              err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
+            }
+          else
+            {
+              /* wait for readability */
+              err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);
+            }
+
+          if (err <= 0)
+            {
+              if (err == 0)
+                {
+                  errno = ETIMEDOUT;
+                  err = -1;
+                }
+
+              break;
+            }
+        }
+      else if (err < 0)
 {
           logprintf (LOG_NOTQUIET, "GnuTLS: %s\n", gnutls_strerror (err));
   if (err == GNUTLS_E_WARNING_ALERT_RECEIVED ||
@@ -461,6 +508,18 @@ ssl_connect_wget (int fd, const char *hostname)
 }
   while (err == GNUTLS_E_WARNING_ALERT_RECEIVED && gnutls_error_is_fatal (err) == 0);
 
+  if (opt.connect_timeout)
+    {
+#ifdef F_GETFL
+      if (fcntl (fd, F_SETFL, flags) < 0)
+        return -1;
+#else
+      const int zero = 0;
+      if (ioctl (fd, FIONBIO, &zero) < 0)
+        return -1;
+#endif
+    }
+
   if (err < 0)
 {
   gnutls_deinit (session);
@@ -468,7 +527,7 @@ ssl_connect_wget (int fd, const char *hostname)
 }
 
   ctx = xnew0 (struct wgnutls_transport_context);
-  ctx->session = session;
+  ctx->session = session;
   fd_register_transport (fd, wgnutls_transport, ctx);
   return true;
 }
-- 
1.8.3.2



signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] timeout option is ingnored if host does not answer SSL handshake (openssl)

2013-07-11 Thread Tim Rühsen
Am Donnerstag, 11. Juli 2013 schrieb Giuseppe Scrivano:
 Tim Rühsen tim.rueh...@gmx.de writes:
 
  diff --git a/src/gnutls.c b/src/gnutls.c
  index 54422fc..a3b4ecc 100644
  --- a/src/gnutls.c
  +++ b/src/gnutls.c
 do
   {
 err = gnutls_handshake (session);
  -      if (err < 0)
  +
  +      if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
  +        {
  +          if (gnutls_record_get_direction (session))
  +            {
  +              /* wait for writeability */
  +              err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
  +            }
  +          else
  +            {
  +              /* wait for readability */
  +              err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);
 
 since this is in a loop, should we also decrement the time we wait for
 at each iteration?  We do something similar in wgnutls_read_timeout.

I saw it, but I took the routine from 'Mget' (it is my code, so I can
contribute it to Wget). It was a matter of the time I had, and I knew that
it works.
The idea is to define 'connect_timeout' as the time during which nothing
happens while connecting.
But please feel free to change it to work as in wgnutls_read_timeout().
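If someone picks this up: roughly like this, I guess (an untested sketch
reusing the names from the patch; ptimer_* are Wget's existing timer helpers,
error handling for the other GnuTLS codes omitted):

double budget = opt.connect_timeout;   /* total handshake budget in seconds */
struct ptimer *timer = ptimer_new ();

do
  {
    err = gnutls_handshake (session);

    if (err == GNUTLS_E_AGAIN)
      {
        /* wait only for the time that is left from the total budget */
        double left = budget - ptimer_measure (timer);
        int dir = gnutls_record_get_direction (session)
                  ? WAIT_FOR_WRITE : WAIT_FOR_READ;

        if (left <= 0 || select_fd (fd, left, dir) <= 0)
          {
            errno = ETIMEDOUT;
            err = -1;
            break;
          }
      }
  }
while (err == GNUTLS_E_AGAIN);

ptimer_destroy (timer);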

BTW, maybe Wget should have something like Curl's -m (a total/maximum
timeout). I need such a thing in several projects (where I use Curl instead
of Wget for this very reason).


 I have fixed some indentation problems and also I had some troubles to
 apply your patch with git am so I had to apply the changes
 separately.  Could you please use the version I have attached?

I locally revert my commit and pull it in from master.

To explain my repeated indentation problems:
The IDE I am working with (NetBeans) doesn't allow a project-based
indentation style. Since I always have several dozen projects open, almost
all of them using 'Linux' style, I have to remove the tabs by hand and
replace them with spaces (in each line). Pretty awful, especially because
the 'artificial intelligence' screws up from time to time. I really can't
write larger GNU code, just some fixes or hacks (though I would like to,
but my poor nerves...).

Does Eclipse do it any better ?

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [Bug--Wget] Issue with RFC 2067 Digest Headers

2013-07-12 Thread Tim Rühsen
Hi,

we need a check in http.c:3759:
if (algorithm != NULL && !strcmp (algorithm, "MD5-sess"))

otherwise we call strcmp() with algorithm being NULL.

That should do it.

Regards, Tim

Am Freitag, 12. Juli 2013 schrieb Darshit Shah:
 
  I have tried this response and wget just crashes here.  What is the
  value for `algorithm' at http.c:3763 (current master version,
  b8f036d16c) when you run it?
 
  It doesn't crash for me. I am currently on HEAD^5 and Wget cleanly exits
 here.
 How do I debug Wget when calling it from inside this script?
 
 
  I think we should set the default value to MD5.
 
  Yes, MD5 should be set as the default algorithm.
 
 
  I must specify both algorithm and qop in the response.
 
  But those attributes are not required by RFC 2067. Though it is obsolete,
 I guess we should support such servers.
 
 I will try and debug this more deeply tonight.
 
 -- 
 Thanking You,
 Darshit Shah
 



signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [Bug--Wget] Issue with RFC 2067 Digest Headers

2013-07-12 Thread Tim Rühsen
 +  realm = opaque = nonce = qop = NULL;
 +  algorithm = "MD5";

Don't do that.
1. 'algorithm' will be xfreed later
2. this forces an algorithm="MD5" parameter even if it wasn't given before
Instead use:
 if (algorithm != NULL && !strcmp (algorithm, "MD5-sess"))

The function does not free values allocated by strdupdelim () when returning.
That seems to be something that has never been done.

I hope, I am not too late ;-)

Regards, Tim

Am Freitag, 12. Juli 2013 schrieb Giuseppe Scrivano:
 Tim Rühsen tim.rueh...@gmx.de writes:
 
  we need a check in http.c:3759:
  if (algorithm != NULL  ! strcmp (algorithm, MD5-sess))
 
  else we strcmp() with algorithm being NULL.
 
  That should do it.
 
 I think the fix should be:
 
 diff --git a/src/http.c b/src/http.c
 index a693355..9f274dc 100644
 --- a/src/http.c
 +++ b/src/http.c
 @@ -3703,7 +3703,8 @@ digest_authentication_encode (const char *au, const char *user,
    param_token name, value;
 
 
 -  realm = opaque = nonce = qop = algorithm = NULL;
 +  realm = opaque = nonce = qop = NULL;
 +  algorithm = "MD5";
 
    au += 6;  /* skip over `Digest' */
    while (extract_param (&au, &name, &value, ','))
 @@ -3785,7 +3786,7 @@ digest_authentication_encode (const char *au, const char *user,
      md5_finish_ctx (&ctx, hash);
      dump_hash (a2buf, hash);
 
 -    if (!strcmp (qop, "auth") || !strcmp (qop, "auth-int"))
 +    if (qop && (!strcmp (qop, "auth") || !strcmp (qop, "auth-int")))
        {
          /* RFC 2617 Digest Access Authentication */
          /* generate random hex string */
 @@ -3835,7 +3836,7 @@ digest_authentication_encode (const char *au, const char *user,
 
      res = xmalloc (res_size);
 
 -    if (!strcmp (qop, "auth"))
 +    if (qop && !strcmp (qop, "auth"))
        {
          res_len = snprintf (res, res_size, "Digest "
                              "username=\"%s\", realm=\"%s\", nonce=\"%s\", uri=\"%s\", response=\"%s\""
 
 
 Any complain?
 
 Cheers,
 Giuseppe
 



signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [Bug--Wget] Issue with RFC 2067 Digest Headers

2013-07-13 Thread Tim Rühsen
Am Freitag, 12. Juli 2013 schrieb Giuseppe Scrivano:
 Tim Rühsen tim.rueh...@gmx.de writes:
 
  +  realm = opaque = nonce = qop = NULL;
  +  algorithm = "MD5";
 
  Don't do that.
  1. 'algorithm' will be xfreed later
  2. this forces an algorithm="MD5" parameter even if it wasn't given before
  Instead use:
   if (algorithm != NULL && !strcmp (algorithm, "MD5-sess"))
 
  The function does not free values allocated by strdupdelim () when
  returning. That seems to be something that has never been done.
 
  I hope, I am not too late ;-)
 
 Oops, sorry, I merged too quickly.  I want to change the behaviour: if
 the algorithm is not specified, then assume MD5.  This is different from
 before, as we assumed it was always specified - and that is incorrect, as
 Darshit pointed out with his example, which is compliant with the RFC.

We assumed 'algorithm' to be MD5 before (implicitly), but had the bug of a
missing check before strcmp().

For me, your changes look ok.

There is just that little issue (more a matter of taste) that I mentioned
above as point 2: when the server does not mention 'algorithm' in
WWW-Authenticate:, should we introduce it in the client's Authorization:
header ? I can't say what is better... RFC 2069 and RFC 2617 leave it open.
At least we would introduce an additional (unneeded) xstrdup/free.

So, the decision is yours ;-)

Regards, Tim

 
 Anything against this? I'll wait for your ACK this time :-)
 
 diff --git a/src/http.c b/src/http.c
 index 9f274dc..3af7009 100644
 --- a/src/http.c
 +++ b/src/http.c
 @@ -3703,8 +3703,7 @@ digest_authentication_encode (const char *au, const char *user,
    param_token name, value;
 
 
 -  realm = opaque = nonce = qop = NULL;
 -  algorithm = "MD5";
 +  realm = opaque = nonce = algorithm = qop = NULL;
 
    au += 6;  /* skip over `Digest' */
    while (extract_param (&au, &name, &value, ','))
 @@ -3743,6 +3742,9 @@ digest_authentication_encode (const char *au, const char *user,
          return NULL;
        }
 
 +  if (algorithm == NULL)
 +    algorithm = xstrdup ("MD5");
 +
    /* Calculate the digest value.  */
    {
      struct md5_ctx ctx;
 @@ -3829,7 +3831,7 @@ digest_authentication_encode (const char *au, const char *user,
               + strlen (path)
               + 2 * MD5_DIGEST_SIZE /*strlen (response_digest)*/
               + (opaque ? strlen (opaque) : 0)
 -             + (algorithm ? strlen (algorithm) : 0)
 +             + strlen (algorithm)
               + (qop ? 128 : 0)
               + strlen (cnonce)
               + 128;
 @@ -3856,11 +3858,15 @@ digest_authentication_encode (const char *au, const char *user,
          res_len += snprintf (res + res_len, res_size - res_len, ", opaque=\"%s\"", opaque);
        }
 
 -    if (algorithm)
 -      {
 -        snprintf (res + res_len, res_size - res_len, ", algorithm=\"%s\"", algorithm);
 -      }
 +    snprintf (res + res_len, res_size - res_len, ", algorithm=\"%s\"", algorithm);
      }
 +
 +  xfree_null (realm);
 +  xfree_null (opaque);
 +  xfree_null (nonce);
 +  xfree_null (qop);
 +  xfree_null (algorithm);
 +
    return res;
  }
  #endif /* ENABLE_DIGEST */
 



signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] NTLM auth broken in 1.13.4

2013-07-13 Thread Tim Rühsen
Am Mittwoch, 10. Juli 2013 schrieb Hrvoje Niksic:
 The NTLM code kindly donated by Daniel has always required OpenSSL.
 configure.ac says:

 Updating the code to also support GNU/TLS appears straightforward.

Here is a (quick) patch for testing, using libnettle (which GnuTLS relies on
anyway).
I can't test it myself since I lack an NTLM-capable server.

Could anyone please test and review it, especially the configure.ac stuff,
which is not one of my strengths.
Do we need to mention libnettle somewhere in the docs ?

Regards, Tim
diff --git a/configure.ac b/configure.ac
index a413b75..f36c71b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -339,11 +339,25 @@ then
 AC_LIBOBJ([http-ntlm])
   fi
 else
-  dnl If SSL is unavailable and the user explicitly requested NTLM,
-  dnl abort.
-  if test x$ENABLE_NTLM = xyes
+  AC_CHECK_LIB(nettle, nettle_md4_init, [with_nettle=yes; AC_SUBST(NETTLE_LIBS, -lnettle) AC_DEFINE([WITH_NETTLE], [1], [Use libnettle])], [with_nettle=no; AC_MSG_WARN(*** libnettle was not found. You will not be able to use NTLM)])
+  AM_CONDITIONAL([WITH_NETTLE], [test x$with_nettle = xyes])
+
+  if test x$with_nettle = xyes
   then
-AC_MSG_ERROR([NTLM authorization requested and OpenSSL not found; aborting])
+if test x$ENABLE_NTLM != xno
+then
+  AC_DEFINE([ENABLE_NTLM], 1,
+   [Define if you want the NTLM authorization support compiled in.])
+  AC_LIBOBJ([http-ntlm])
+  LIBS=$NETTLE_LIBS $LIBS
+fi
+  else
+dnl If SSL is unavailable and the user explicitly requested NTLM,
+dnl abort.
+if test x$ENABLE_NTLM = xyes
+then
+  AC_MSG_ERROR([NTLM authorization requested and SSL not enabled; aborting])
+fi
   fi
 fi
 
diff --git a/src/http-ntlm.c b/src/http-ntlm.c
index 86eca66..63b1262 100644
--- a/src/http-ntlm.c
+++ b/src/http-ntlm.c
@@ -42,27 +42,33 @@ as that of the covered work.  */
 #include <string.h>
 #include <stdlib.h>
 
-#include <openssl/des.h>
-#include <openssl/md4.h>
-#include <openssl/opensslv.h>
-
 #include "utils.h"
 #include "http-ntlm.h"
 
-#if OPENSSL_VERSION_NUMBER < 0x00907001L
-#define DES_key_schedule des_key_schedule
-#define DES_cblock des_cblock
-#define DES_set_odd_parity des_set_odd_parity
-#define DES_set_key des_set_key
-#define DES_ecb_encrypt des_ecb_encrypt
-
-/* This is how things were done in the old days */
-#define DESKEY(x) x
-#define DESKEYARG(x) x
+#ifdef WITH_NETTLE
+#	include <nettle/md4.h>
+#	include <nettle/des.h>
 #else
-/* Modern version */
-#define DESKEYARG(x) *x
-#define DESKEY(x) &x
+#	include <openssl/des.h>
+#	include <openssl/md4.h>
+#	include <openssl/opensslv.h>
+
+#	if OPENSSL_VERSION_NUMBER < 0x00907001L
+#		define DES_key_schedule des_key_schedule
+#		define DES_cblock des_cblock
+#		define DES_set_odd_parity des_set_odd_parity
+#		define DES_set_key des_set_key
+#		define DES_ecb_encrypt des_ecb_encrypt
+
+		/* This is how things were done in the old days */
+#		define DESKEY(x) x
+#		define DESKEYARG(x) x
+#	else
+		/* Modern version */
+#		define DESKEYARG(x) *x
+#		define DESKEY(x) &x
+#	endif
+
 #endif
 
 /* Define this to make the type-3 message include the NT response message */
@@ -176,6 +182,25 @@ ntlm_input (struct ntlmdata *ntlm, const char *header)
  * Turns a 56 bit key into the 64 bit, odd parity key and sets the key.  The
  * key schedule ks is also set.
  */
+#ifdef WITH_NETTLE
+static void
+setup_des_key(unsigned char *key_56,
+  struct des_ctx *des)
+{
+  unsigned char key[8];
+
+  key[0] = key_56[0];
+  key[1] = ((key_56[0] << 7) & 0xFF) | (key_56[1] >> 1);
+  key[2] = ((key_56[1] << 6) & 0xFF) | (key_56[2] >> 2);
+  key[3] = ((key_56[2] << 5) & 0xFF) | (key_56[3] >> 3);
+  key[4] = ((key_56[3] << 4) & 0xFF) | (key_56[4] >> 4);
+  key[5] = ((key_56[4] << 3) & 0xFF) | (key_56[5] >> 5);
+  key[6] = ((key_56[5] << 2) & 0xFF) | (key_56[6] >> 6);
+  key[7] =  (key_56[6] << 1) & 0xFF;
+
+  nettle_des_set_key(des, key);
+}
+#else
 static void
 setup_des_key(unsigned char *key_56,
   DES_key_schedule DESKEYARG(ks))
@@ -194,7 +219,7 @@ setup_des_key(unsigned char *key_56,
   DES_set_odd_parity(&key);
   DES_set_key(&key, ks);
 }
-
+#endif
  /*
   * takes a 21 byte array and treats it as 3 56-bit DES keys. The
   * 8 byte plaintext is encrypted with each key and the resulting 24
@@ -203,6 +228,18 @@ setup_des_key(unsigned char *key_56,
 static void
 calc_resp(unsigned char *keys, unsigned char *plaintext, unsigned char *results)
 {
+#ifdef WITH_NETTLE
+  struct des_ctx des;
+
+  setup_des_key(keys, &des);
+  nettle_des_encrypt(&des, 8, results, plaintext);
+
+  setup_des_key(keys + 7, &des);
+  nettle_des_encrypt(&des, 8, results + 8, plaintext);
+
+  setup_des_key(keys + 14, &des);
+  nettle_des_encrypt(&des, 8, results + 16, plaintext);
+#else
   DES_key_schedule ks;
 
   setup_des_key(keys, DESKEY(ks));
@@ -216,6 +253,7 @@ calc_resp(unsigned char *keys, unsigned char *plaintext, unsigned char *results)
   setup_des_key(keys+14, DESKEY(ks));
   DES_ecb_encrypt((DES_cblock*) plaintext, (DES_cblock*) (results+16),
 

Re: [Bug-wget] [Bug--Wget] Issue with RFC 2067 Digest Headers

2013-07-14 Thread Tim Rühsen
Am Sonntag, 14. Juli 2013, 00:47:48 schrieb Giuseppe Scrivano:
 Darshit Shah dar...@gmail.com writes:
  Do you know a test HTTP server that supports auth-int ?
  If yes, we could try to implement it.
  
  In the Test Suite I am currently writing, I had the server to simply send
  
  a qop=auth-int without really supporting it to see how Wget responds.
  I can implement auth-int support in that server too if we have any
  intentions of adding support to Wget.
  
  You are right:
  At the moment any other qop value than 'auth' or missing qop return
  throws
  out
  
    logprintf (LOG_NOTQUIET, _("Unsupported quality of protection '%s'.\n"),
               qop);
  and returns NULL, which in turn just removes the Authorization header but
  doesn't stop the GET request (in gethttp()).
  
  If we want that, digest_authentication_encode() would need to return a
  status/error code.
  
  Yes. A way to exit out of the loop instantly.
  
  But this issue should not stop Guiseppe's patch.
  
  His patch is already pushed.
 
 I think we can address this issue separately.  It is a nice optimization
 to have but I wouldn't consider it as a blocking bug for the release.

Darshit, could you please test this patch ?

###
diff --git a/src/http.c b/src/http.c
index 669f0fe..d50f20e 100644
--- a/src/http.c
+++ b/src/http.c
@@ -2379,28 +2379,38 @@ read_header:
   else if (!basic_auth_finished
            || !BEGINS_WITH (www_authenticate, "Basic"))
     {
-      char *pth;
-      pth = url_full_path (u);
-      request_set_header (req, "Authorization",
-                          create_authorization_line (www_authenticate,
-                                                     user, passwd,
-                                                     request_method (req),
-                                                     pth,
-                                                     &auth_finished),
-                          rel_value);
-      if (BEGINS_WITH (www_authenticate, "NTLM"))
-        ntlm_seen = true;
-      else if (!u->user && BEGINS_WITH (www_authenticate, "Basic"))
+      char *pth = url_full_path (u);
+      const char *value;
+
+      value = create_authorization_line (www_authenticate,
+                                         user, passwd,
+                                         request_method (req),
+                                         pth,
+                                         &auth_finished);
+
+      if (value)
+        {
+          request_set_header (req, "Authorization", value, rel_value);
+
+          if (BEGINS_WITH (www_authenticate, "NTLM"))
+            ntlm_seen = true;
+          else if (!u->user && BEGINS_WITH (www_authenticate, "Basic"))
+            {
+              /* Need to register this host as using basic auth,
+               * so we automatically send creds next time. */
+              register_basic_auth_host (u->host);
+            }
+
+          xfree (pth);
+          xfree_null (message);
+          resp_free (resp);
+          xfree (head);
+          goto retry_with_auth;
+        }
+      else
         {
-          /* Need to register this host as using basic auth,
-           * so we automatically send creds next time. */
-          register_basic_auth_host (u->host);
+          /* Creating the Authorization header went wrong */
         }
-      xfree (pth);
-      xfree_null (message);
-      resp_free (resp);
-      xfree (head);
-      goto retry_with_auth;
     }
   else
     {
###

signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [Bug-Wget] Wrong Error Codes returned on Digest Auth Failures.

2013-07-14 Thread Tim Rühsen
Am Sonntag, 14. Juli 2013, 19:02:36 schrieb Darshit Shah:
 Hi,
 
 In http.c:3739, we club 3 different error types in one.
 
 1. The Server did not send a nonce / realm / uri attribute in the
 WWW-Authenticate Header. This should exit as a Protocol Error, Status 7
 2. Wget was not invoked with a Username / Password. Authorization Error,
 Status 6.
 3. We do not understand the qop / algorithm value sent by the server.
 General Error, Status 1.
 
 I think we should split these errors and exit Wget with the correct status
 codes. It would be wrong to inform the end-user that there was a
 username/password authentication error when in reality the issue was caused
 by, say, the Server sending an incorrect header.
 
 If there is consensus on fixing this, I'll try and hack a patch soon.

Sounds like a good idea.
That would obsolete my patch that you tested today (thanks for testing).
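
Just to make the mapping concrete (an illustrative toy - these names are not
Wget's actual API):

#include <stdio.h>

enum auth_failure { AUTH_MISSING_ATTR, AUTH_NO_CREDENTIALS, AUTH_UNSUPPORTED };

/* Map the three causes Darshit listed to the documented exit codes. */
static int
exit_status_for (enum auth_failure f)
{
  switch (f)
    {
    case AUTH_MISSING_ATTR:   return 7;  /* protocol error */
    case AUTH_NO_CREDENTIALS: return 6;  /* authentication failure */
    default:                  return 1;  /* generic error */
    }
}

int main (void)
{
  printf ("%d\n", exit_status_for (AUTH_MISSING_ATTR));  /* prints 7 */
  return 0;
}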

Are you also working on NTLM support for your server ?
If so, could you test my GnuTLS NTLM patch from yesterday, so Giuseppe can
apply it if it works.

Thanks, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [Bug-Wget] Wrong Error Codes returned on Digest Auth Failures.

2013-07-15 Thread Tim Rühsen
Am Montag, 15. Juli 2013, 03:34:46 schrieb Darshit Shah:

 Wait, this is all Client End. Wget already has NTLM client-end support.
 I need to write a Test Server for it.  

There seems to be an Apache module for NTLM at
http://modntlm.sourceforge.net/

You should give writing your own NTLM server a low priority.
I can probably test the libnettle code with a local Apache server running
mod_ntlm. But it doesn't have high priority for me either.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] NTLM auth broken in 1.13.4

2013-07-15 Thread Tim Rühsen
Am Montag, 15. Juli 2013, 09:50:27 schrieb Tom Merriam:
 On 07/13/2013 08:00 AM, Tim Rühsen wrote:
  Am Mittwoch, 10. Juli 2013 schrieb Hrvoje Niksic:
  The NTLM code kindly donated by Daniel has always required OpenSSL.
  configure.ac says:
  
  Updating the code to also support GNU/TLS appears straightforward.
  
  Here is a (quick) patch for testing using libnettle (which GnuTLS relies
  on
  anyway).
  I can't test it myself since lack of an NTLM capable server.
  
  Please could anyone test it and review it, especially the configure.ac
  stuff which is not one of my strengths.
  Do we need to mention libnettle somewhere in the docs ?
  
  Regards, Tim
 
 I applied this patch to 1.13.4 and built with configure  make but it
 didn't work. I still get the same 'unknown authentication scheme' error.
 
 Does wget need to be built differently for this to work?

Thanks, Tom.

src/config.h should contain the lines
#define ENABLE_NTLM 1
and
#define WITH_NETTLE 1

Maybe 1.13.4's ./configure explicitly needs --with-ssl=gnutls --with-ntlm.

src/wget --version printout:
GNU Wget 1.14.61-5862-dirty built on linux-gnu.

+digest +https +ipv6 +iri +large-file +nls +ntlm +opie +ssl/gnutls 
...

should look similar to the above line. Important is +ntlm, +ssl/gnutls and I 
guess +digest.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] NTLM auth broken in 1.13.4

2013-07-16 Thread Tim Rühsen
Am Montag, 15. Juli 2013, 16:48:36 schrieb Tom Merriam:
  When built with configure  make (with that patch):
  
  GNU Wget 1.14 built on linux-gnu.
  
  +digest +https +ipv6 -iri +large-file +nls -ntlm +opie +ssl/gnutls
  
  Wgetrc:
  /usr/local/etc/wgetrc (system)
  
  Locale: /usr/local/share/locale
  Compile: gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC=/usr/local/etc/wgetrc
  
  -DLOCALEDIR=/usr/local/share/locale -I. -I../lib -I../lib -O2
  -Wall
  
  Link: gcc -O2 -Wall -lgnutls -lgcrypt -lgpg-error -lz -lz -lrt ftp-opie.o
  
  gnutls.o ../lib/libgnu.a
  
  I downloaded 1.13.4 and applied the patch and built with configure
  --with-ssl=gnutls --enable-ntlm  make
  (configure doesn't accept --with-ntlm)
  
  I don't see WITH_NETTLE in config.h and the build fails with:
  
  configure: error: NTLM authorization requested and OpenSSL not found;
  aborting
  
  Am I applying the patch incorrectly or against the wrong version?
  
  Hi Tom,
  
  Wget 1.14 is perfect.
  
  I don't see WITH_NETTLE in config.h and the build fails with:
  configure: error: NTLM authorization requested and OpenSSL not found;
  aborting
  
  Sorry, I forgot to say: after patching, you should first call
  
  autoreconf
  
  to create a new version of configure.
  After that you do a ./configure.
  
  
  If it still does not work:
  
  Either this comes because you don't have libnettle-dev installed (ls -la
  /usr/include/nettle should show up with a bunch of header files).
  
  Or you did not apply the first part of the patch (configure.ac).
  
  
  Regards, Tim
 
 Okay, I installed nettle-dev and ran autoreconf. Now the build (1.14)
 fails with
 
 gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\/usr/local/etc/wgetrc\
 -DLOCALEDIR=\/usr/local/share/locale\ -I.  -I../lib -I../lib   -O2
 -Wall -MT http-ntlm.o -MD -MP -MF .deps/http-ntlm.Tpo -c -o http-ntlm.o
 http-ntlm.c
 mv -f .deps/retr.Tpo .deps/retr.Po
 echo '#include "wget.h"' > css_.c
 cat css.c >> css_.c
 cat: css.c: No such file or directory
 make[3]: *** [css_.c] Error 1
 make[3]: *** Waiting for unfinished jobs
 mv -f .deps/ftp-opie.Tpo .deps/ftp-opie.Po
 mv -f .deps/http-ntlm.Tpo .deps/http-ntlm.Po
 gnutls.c: In function 'ssl_connect_wget':
 gnutls.c:394:38: warning: cast to pointer from integer of different size
 [-Wint-to-pointer-cast]
 mv -f .deps/warc.Tpo .deps/warc.Po
 mv -f .deps/gnutls.Tpo .deps/gnutls.Po
 mv -f .deps/url.Tpo .deps/url.Po
 mv -f .deps/http.Tpo .deps/http.Po
 mv -f .deps/utils.Tpo .deps/utils.Po
 make[3]: Leaving directory `/home/tmerriam/src/wget-1.14/src'
 make[2]: *** [all] Error 2
 make[2]: Leaving directory `/home/tmerriam/src/wget-1.14/src'
 make[1]: *** [all-recursive] Error 1
 make[1]: Leaving directory `/home/tmerriam/src/wget-1.14'
 make: *** [all] Error 2
 
 It looks like 'css.c' should be 'css.l' somewhere?

You installed nettle-dev and ran 'autoreconf'.
Now you should run
./configure
make clean
make

BTW, css_.c and css.c should automatically be constructed from css.l, which
is a flex rules file (flex is a lexical scanner generator). But this has
nothing to do with my patch...

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] NTLM auth broken in 1.13.4

2013-07-16 Thread Tim Rühsen
Am Dienstag, 16. Juli 2013, 08:34:28 schrieb Tom Merriam:
 I tried that, but had the same problem. I gave up, re-extracted the
 source from archive, and reapplied patch. I compiled it with configure
  make and it works!
 
 I am able to authenticate with my Windows Server.
 
 GNU Wget 1.14 built on linux-gnu.
 
 +digest +https +ipv6 -iri +large-file +nls +ntlm +opie +ssl/gnutls
 
 Wgetrc:
 /usr/local/etc/wgetrc (system)
 Locale: /usr/local/share/locale
 Compile: gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC=/usr/local/etc/wgetrc
 -DLOCALEDIR=/usr/local/share/locale -I. -I../lib -I../lib -O2
 -Wall
 Link: gcc -O2 -Wall -lnettle -lgnutls -lgcrypt -lgpg-error -lz -lz -lrt
 ftp-opie.o gnutls.o http-ntlm.o ../lib/libgnu.a

Tom!!! That's great !

Thank you very much for your time and patience!

@Giuseppe: I will git format-patch the stuff this evening.
Is there anything missing ?

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


[Bug-wget] [PATCH] NTLM auth with GnuTLS/Nettle

2013-07-16 Thread Tim Rühsen
Sorry, git bugged me again.
Somehow the last pull merged together with my local changes and everything is 
in one commit now.

So, the attached patch is not in git-format-patch format and a ChangeLog entry 
is missing:

2013-07-16  Tim Ruehsen  tim.rueh...@gmx.de

* NTLM support using libnettle
* Requested and tested by Tom Merriam

Regards, Tim

diff --git a/configure.ac b/configure.ac
index a413b75..f36c71b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -339,11 +339,25 @@ then
 AC_LIBOBJ([http-ntlm])
   fi
 else
-  dnl If SSL is unavailable and the user explicitly requested NTLM,
-  dnl abort.
-  if test x$ENABLE_NTLM = xyes
+  AC_CHECK_LIB(nettle, nettle_md4_init, [with_nettle=yes; AC_SUBST(NETTLE_LIBS, -lnettle) AC_DEFINE([WITH_NETTLE], [1], [Use libnettle])], [with_nettle=no; AC_MSG_WARN(*** libnettle was not found. You will not be able to use NTLM)])
+  AM_CONDITIONAL([WITH_NETTLE], [test x$with_nettle = xyes])
+
+  if test x$with_nettle = xyes
   then
-AC_MSG_ERROR([NTLM authorization requested and OpenSSL not found; aborting])
+if test x$ENABLE_NTLM != xno
+then
+  AC_DEFINE([ENABLE_NTLM], 1,
+   [Define if you want the NTLM authorization support compiled in.])
+  AC_LIBOBJ([http-ntlm])
+  LIBS=$NETTLE_LIBS $LIBS
+fi
+  else
+dnl If SSL is unavailable and the user explicitly requested NTLM,
+dnl abort.
+if test x$ENABLE_NTLM = xyes
+then
+  AC_MSG_ERROR([NTLM authorization requested and SSL not enabled; aborting])
+fi
   fi
 fi
 
diff --git a/src/http-ntlm.c b/src/http-ntlm.c
index 86eca66..63b1262 100644
--- a/src/http-ntlm.c
+++ b/src/http-ntlm.c
@@ -42,27 +42,33 @@ as that of the covered work.  */
 #include <string.h>
 #include <stdlib.h>
 
-#include <openssl/des.h>
-#include <openssl/md4.h>
-#include <openssl/opensslv.h>
-
 #include "utils.h"
 #include "http-ntlm.h"
 
-#if OPENSSL_VERSION_NUMBER < 0x00907001L
-#define DES_key_schedule des_key_schedule
-#define DES_cblock des_cblock
-#define DES_set_odd_parity des_set_odd_parity
-#define DES_set_key des_set_key
-#define DES_ecb_encrypt des_ecb_encrypt
-
-/* This is how things were done in the old days */
-#define DESKEY(x) x
-#define DESKEYARG(x) x
+#ifdef WITH_NETTLE
+#	include <nettle/md4.h>
+#	include <nettle/des.h>
 #else
-/* Modern version */
-#define DESKEYARG(x) *x
-#define DESKEY(x) &x
+#	include <openssl/des.h>
+#	include <openssl/md4.h>
+#	include <openssl/opensslv.h>
+
+#	if OPENSSL_VERSION_NUMBER < 0x00907001L
+#		define DES_key_schedule des_key_schedule
+#		define DES_cblock des_cblock
+#		define DES_set_odd_parity des_set_odd_parity
+#		define DES_set_key des_set_key
+#		define DES_ecb_encrypt des_ecb_encrypt
+
+		/* This is how things were done in the old days */
+#		define DESKEY(x) x
+#		define DESKEYARG(x) x
+#	else
+		/* Modern version */
+#		define DESKEYARG(x) *x
+#		define DESKEY(x) &x
+#	endif
+
 #endif
 
 /* Define this to make the type-3 message include the NT response message */
@@ -176,6 +182,25 @@ ntlm_input (struct ntlmdata *ntlm, const char *header)
  * Turns a 56 bit key into the 64 bit, odd parity key and sets the key.  The
  * key schedule ks is also set.
  */
+#ifdef WITH_NETTLE
+static void
+setup_des_key(unsigned char *key_56,
+  struct des_ctx *des)
+{
+  unsigned char key[8];
+
+  key[0] = key_56[0];
+  key[1] = ((key_56[0] << 7) & 0xFF) | (key_56[1] >> 1);
+  key[2] = ((key_56[1] << 6) & 0xFF) | (key_56[2] >> 2);
+  key[3] = ((key_56[2] << 5) & 0xFF) | (key_56[3] >> 3);
+  key[4] = ((key_56[3] << 4) & 0xFF) | (key_56[4] >> 4);
+  key[5] = ((key_56[4] << 3) & 0xFF) | (key_56[5] >> 5);
+  key[6] = ((key_56[5] << 2) & 0xFF) | (key_56[6] >> 6);
+  key[7] =  (key_56[6] << 1) & 0xFF;
+
+  nettle_des_set_key(des, key);
+}
+#else
 static void
 setup_des_key(unsigned char *key_56,
   DES_key_schedule DESKEYARG(ks))
@@ -194,7 +219,7 @@ setup_des_key(unsigned char *key_56,
   DES_set_odd_parity(&key);
   DES_set_key(&key, ks);
 }
-
+#endif
  /*
   * takes a 21 byte array and treats it as 3 56-bit DES keys. The
   * 8 byte plaintext is encrypted with each key and the resulting 24
@@ -203,6 +228,18 @@ setup_des_key(unsigned char *key_56,
 static void
 calc_resp(unsigned char *keys, unsigned char *plaintext, unsigned char *results)
 {
+#ifdef WITH_NETTLE
+  struct des_ctx des;
+
+  setup_des_key(keys, &des);
+  nettle_des_encrypt(&des, 8, results, plaintext);
+
+  setup_des_key(keys + 7, &des);
+  nettle_des_encrypt(&des, 8, results + 8, plaintext);
+
+  setup_des_key(keys + 14, &des);
+  nettle_des_encrypt(&des, 8, results + 16, plaintext);
+#else
   DES_key_schedule ks;
 
   setup_des_key(keys, DESKEY(ks));
@@ -216,6 +253,7 @@ calc_resp(unsigned char *keys, unsigned char *plaintext, unsigned char *results)
   setup_des_key(keys+14, DESKEY(ks));
   DES_ecb_encrypt((DES_cblock*) plaintext, (DES_cblock*) (results+16),
   DESKEY(ks), DES_ENCRYPT);
+#endif
 }
 
 /*
@@ -255,6 +293,15 @@ mkhash(const char *password,
 
   {
 /* create LanManager hashed password */
+#ifdef 

Re: [Bug-wget] [Bug-Wget] Wrong Error Codes returned on Digest Auth Failures.

2013-07-16 Thread Tim Rühsen
Am Dienstag, 16. Juli 2013, 20:52:01 schrieb Darshit Shah:

 There are two regions that I would like to draw attention to:
 1. http.c:3752 : The code is quite redundant and I would prefer that it was 
 somehow merged. Ideas on fixing this would be greatly appreciated! 

I guess you are talking about calculating 'hash' two times when
algorithm=MD5-sess.

That is indeed unneeded. It should be like:

if (algorithm && !strcmp (algorithm, "MD5-sess"))
  {
    /* A1BUF = H( H(user : realm : password) : nonce : cnonce ) */
    ... calc hash
  }
else
  {
    /* A1BUF = H(user : realm : password) */
    ... calc hash
  }

dump_hash (a1buf, hash);

Fix it, test it and if it's good, submit a patch ;-)

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [Bug-Wget] Wrong Error Codes returned on Digest Auth Failures.

2013-07-18 Thread Tim Rühsen
Am Donnerstag, 18. Juli 2013, 02:32:21 schrieb Darshit Shah:

 I see no wasted cycles in here. 

Sorry, my mistake.

And you are right, a bit of code cleanup wouldn't be too bad.

Tim


signature.asc
Description: This is a digitally signed message part.


[Bug-wget] --spider -r creates directories, wanted behaviour ?

2013-07-20 Thread Tim Rühsen
Hi,

while playing around with the test suite, I realized that --spider -r
creates a directory structure - the directories where Wget would save the
files without --spider.

Is that intended behaviour (feature) or is it a bug ?

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] --spider -r creates directories, wanted behaviour ?

2013-07-20 Thread Tim Rühsen
Am Samstag, 20. Juli 2013, 23:49:11 schrieb Darshit Shah:

 When using spider, I guess this should be classified as a bug. 


 I'll see if I can look into fixing it. I will add a test for the same
 nonetheless in the Suite I am working on.

I just took a quick look: the same applies to --delete-after -r.

The correct way to clean up the directories would be to keep track of all
successfully created directories and remove them after all files have been
removed. That won't remove previously existing directories.
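Roughly like this, I guess (an untested sketch - none of these helpers is
existing Wget API):

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct created_dir { char *name; struct created_dir *next; };
static struct created_dir *created_dirs;   /* newest (deepest) first */

/* Call this whenever Wget itself had to mkdir() a directory. */
static void
remember_created_dir (const char *name)
{
  struct created_dir *d = malloc (sizeof *d);
  d->name = strdup (name);
  d->next = created_dirs;   /* prepend, so subdirs come before parents */
  created_dirs = d;
}

/* After --spider/--delete-after removed all files: rmdir() simply fails
   for directories that are not empty, so pre-existing content survives. */
static void
cleanup_created_dirs (void)
{
  while (created_dirs)
    {
      struct created_dir *d = created_dirs;
      created_dirs = d->next;
      rmdir (d->name);
      free (d->name);
      free (d);
    }
}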

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] NTLM auth broken in 1.13.4

2013-07-23 Thread Tim Rühsen
Am Montag, 22. Juli 2013, 23:13:17 schrieb Darshit Shah:


This patch seems to break for normal builds. 

I get the following error on running make:


configure: error: conditional HAVE_NETTLE was never defined.
Usually this means the macro was only invoked conditionally. 
make: *** [config.status] Error 1


After
AC_CHECK_LIB(nettle, ...
the variable $HAVE_NETTLE should/must be defined.

The above error puzzles me.
Did you do an 'autoreconf' after you locally applied the commit ?
Could someone explain that error to me ?

Regards, Tim



signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] NTLM auth broken in 1.13.4

2013-07-23 Thread Tim Rühsen
Am Dienstag, 23. Juli 2013, 13:43:13 schrieb Giuseppe Scrivano:
 Tim Rühsen tim.rueh...@gmx.de writes:
  The above error puzzles me.
  
  Did you do an 'autoreconf' after you locally applied the commit ?
  
  Could someone explain that error to me ?
 
 I think you can reproduce it when you try to build with openssl instead
 of gnutls.
 
 What do you get when you use ./configure --with-ssl=openssl?
 
 Moving the AM_CONDITIONAL line seems to fix it.

You changed my original patch in a way that AM_CONDITIONAL isn't needed any
more. Try commenting it out - it should work.

Regards, Tim

signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] NTLM auth broken in 1.13.4

2013-07-23 Thread Tim Rühsen
Am Dienstag, 23. Juli 2013, 15:47:35 schrieb Giuseppe Scrivano:
 Tim Rühsen tim.rueh...@gmx.de writes:
  You changed my original patch in a way, that you won't need AM_CONDITIONAL
  any more. Try commenting it out - it should work.
 
 Thanks, it seems to work here.  Are you ok with this commit?
 
 I have also added a missing entry for the ChangeLog file.

Thanks, perfect.

Regards, Tim

signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] New option: --rename-output: modify output filename with perl

2013-07-26 Thread Tim Rühsen
Thank you for your work, Andrew !

In general, I like the idea of being able to read and/or modify the filenames.

Just for discussion: what about a slightly more extended option?

Call an external program after downloading and saving, not only with the
filename but also with additional information (e.g. HTTP header stuff like
content-type etc.). This external program would be able to rename the file
(telling Wget about it via pipe output, as Andrew suggested), but also to
analyze the file content, save meta-infos into a database, extract and
execute javascript, etc.

Two thoughts:
- the whole idea is not relevant for single downloads.
- the above things could be done by analysing Wget's debugging output. I
have done this several times. But the debugging output is not documented,
so these solutions are hacks and might break with the next version of Wget.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] GnuTLS certificate loading

2013-08-03 Thread Tim Rühsen
Am Samstag, 3. August 2013, 00:14:38 schrieb Ángel González:
 On 02/08/13 16:11, Tim Ruehsen wrote:
  Hi,
  
  I realized that gnutls.c loads every file it can find in the given
  ca_directory (default: /etc/ssl/certs).
  
  For me (on Debian SID) it means, every certificate is loaded 4 times !
  
  Example Visa certificate:
  ~/src/wget/src$ l /etc/ssl/certs|grep Visa
  lrwxrwxrwx 1 root root 23 11-06-13 08:40:39 6fcc125d.0 -> Visa_eCommerce_Root.pem
  lrwxrwxrwx 1 root root 23 11-06-13 08:40:39 a760e1bd.0 -> Visa_eCommerce_Root.pem
  lrwxrwxrwx 1 root root 58 27-10-11 09:39:52 Visa_eCommerce_Root.pem -> /usr/share/ca-certificates/mozilla/Visa_eCommerce_Root.crt
 
 I wonder why you have two different hashes for the same file. Maybe one
 of them
 comes from an old Visa_eCommerce_Root.crt ?
 Those hashes are normally created by c_rehash(1)

Well, I don't know. But calling c_rehash creates two sums per file:

root@debian:~# c_rehash /etc/ssl/certs/
Doing /etc/ssl/certs/
Camerfirma_Global_Chambersign_Root.pem => cb59f961.0
Camerfirma_Global_Chambersign_Root.pem => a0bc6fbb.0
Chambers_of_Commerce_Root_-_2008.pem => c47d9980.0
Chambers_of_Commerce_Root_-_2008.pem => 1eb37bdf.0
A-Trust-nQual-03.pem => 9c472bf7.0
A-Trust-nQual-03.pem => c3a6a9ad.0
...


  That is 3 times plus loading of ca-certificates.crt kept in
  /etc/ssl/certs/, which seems to contain all certificates from
  /etc/ssl/certs.
 
 Almost. It contains all certificates activated in
 /etc/ca-certificates.conf (all, by
 default). See update-ca-certificates(8)

Good to know, thank you.


  It would be easy to fix that, if backwards compatibility wasn't an issue:
  1. If we just load *.pem files, we would miss *.crt files
  2. If we just load *.crt files, we would miss *.pem files
  3. If we load both *.pem and *.crt files, we also load aggregations like
  ca- certificates.crt (loading certs twice).
 
 We are obtaining the final inode in the stat(). We should keep a list of
 loaded
 inodes to avoid loading the same file several times.
 Although that wouldn't fix the duplication with aggregations.

That's a good idea. I'll implement that next week, using Wget's hashtable
stuff. At least for user-provided directories, or if
gnutls_certificate_set_x509_system_trust() is not available.

  My favorite would be to use
  
  gnutls_certificate_set_x509_system_trust()
  
  for the default case (opt.ca_certificate == NULL) instead of the
  hard-coded
  /etc/ssl/certs/. This function loads all certs from the 'system' certs
  directory just once.
 
 Looks good.
 
  For a user-provided cert directory, we should keep the current behavior of
  loading every file in the directory. Anything else may break Wget
  compatibility.
  
  I already have made the changes, but would like to see comments and/or
  opinions.

Thanks for your response.

Tim



signature.asc
Description: This is a digitally signed message part.


[Bug-wget] [PATCH] GnuTLS certificate loading

2013-08-03 Thread Tim Rühsen
Some improvements to gnutls.c, especially improved certificate loading.

Regards, Tim
From 1194317f35a014c878526dc3d2ada55ebd5fd6de Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Sat, 3 Aug 2013 19:56:39 +0200
Subject: [PATCH] gnutls improvements

---
 src/ChangeLog |   7 
 src/gnutls.c  | 131 +-
 2 files changed, 91 insertions(+), 47 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index fe7ce5f..fcb931f 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,10 @@
+2013-07-13  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* gnutls.c (ssl_init): Prevent CA files from being loaded twice
+  if possible.
+* gnutls.c (ssl_check_certificate): Added some error messages
+* gnutls.c: Fixed some compiler warnings
+
 2013-07-16  Darshit Shah  dar...@gmail.com
 
 	* wget.h (err_t): Added new errors, ATTRMISSING and UNKNOWNATTR to
diff --git a/src/gnutls.c b/src/gnutls.c
index d90f46a..1661fb3 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -46,6 +46,7 @@ as that of the covered work.  */
 #include connect.h
 #include url.h
 #include ptimer.h
+#include hash.h
 #include ssl.h
 
 #include sys/fcntl.h
@@ -81,49 +82,79 @@ ssl_init (void)
 {
   /* Becomes true if GnuTLS is initialized. */
   static bool ssl_initialized = false;
+  const char *ca_directory;
+  DIR *dir;
+  int ncerts = -1;
 
   /* GnuTLS should be initialized only once. */
   if (ssl_initialized)
 return true;
 
-  const char *ca_directory;
-  DIR *dir;
-
   gnutls_global_init ();
   gnutls_certificate_allocate_credentials (credentials);
   gnutls_certificate_set_verify_flags(credentials,
   GNUTLS_VERIFY_ALLOW_X509_V1_CA_CRT);
 
-  ca_directory = opt.ca_directory ? opt.ca_directory : "/etc/ssl/certs";
+#if GNUTLS_VERSION_MAJOR >= 3
+  if (!opt.ca_directory)
+    ncerts = gnutls_certificate_set_x509_system_trust (credentials);
+#endif
 
-  dir = opendir (ca_directory);
-  if (dir == NULL)
+  /* If the GnuTLS version is too old or CA loading failed, fall back to
+     the old behaviour.  Also use the old behaviour if the CA directory
+     is user-provided.  */
+  if (ncerts <= 0)
     {
-      if (opt.ca_directory && *opt.ca_directory)
-        logprintf (LOG_NOTQUIET, _("ERROR: Cannot open directory %s.\n"),
-                   opt.ca_directory);
-    }
-  else
-    {
-      struct dirent *dent;
-      while ((dent = readdir (dir)) != NULL)
+      ca_directory = opt.ca_directory ? opt.ca_directory : "/etc/ssl/certs";
+
+      if ((dir = opendir (ca_directory)) == NULL)
         {
-          struct stat st;
-          char *ca_file;
-          asprintf (&ca_file, "%s/%s", ca_directory, dent->d_name);
+          if (opt.ca_directory && *opt.ca_directory)
+            logprintf (LOG_NOTQUIET, _("ERROR: Cannot open directory %s.\n"),
+                       opt.ca_directory);
+        }
+      else
+        {
+          struct hash_table *inode_map = hash_table_new (196, NULL, NULL);
+          struct dirent *dent;
+          size_t dirlen = strlen(ca_directory);
+          int rc;
 
-          stat (ca_file, &st);
+          ncerts = 0;
 
-          if (S_ISREG (st.st_mode))
-            gnutls_certificate_set_x509_trust_file (credentials, ca_file,
-                                                    GNUTLS_X509_FMT_PEM);
+          while ((dent = readdir (dir)) != NULL)
+            {
+              struct stat st;
+              char ca_file[dirlen + strlen(dent->d_name) + 2];
 
-          free (ca_file);
-        }
+              snprintf (ca_file, sizeof(ca_file), "%s/%s", ca_directory,
+                        dent->d_name);
+
+              if (stat (ca_file, &st) != 0)
+                continue;
+
+              if (! S_ISREG (st.st_mode))
+                continue;
+
+              /* avoid loading the same file twice by checking the inode */
+              if (hash_table_contains (inode_map, (void *)(intptr_t) st.st_ino))
+                continue;
+
+              hash_table_put (inode_map, (void *)(intptr_t) st.st_ino, NULL);
+
+              if ((rc = gnutls_certificate_set_x509_trust_file (credentials,
+                          ca_file, GNUTLS_X509_FMT_PEM)) <= 0)
+                logprintf (LOG_NOTQUIET, _("ERROR: Failed to open cert %s: (%d).\n"),
+                           ca_file, rc);
+              else
+                ncerts += rc;
+            }
 
-      closedir (dir);
+          hash_table_destroy (inode_map);
+          closedir (dir);
+        }
     }
 
+  DEBUGP (("Certificates loaded: %d\n", ncerts));
+
   /* Use the private key from the cert file unless otherwise specified. */
   if (opt.cert_file  !opt.private_key)
 {
@@ -278,7 +309,7 @@ wgnutls_read (int fd, char *buf, int bufsize, void *arg)
 }
 
 static int
-wgnutls_write (int fd, char *buf, int bufsize, void *arg)
+wgnutls_write (int fd _GL_UNUSED, char *buf, int bufsize, void *arg)
 {
   int ret;
   struct wgnutls_transport_context *ctx = arg;
@@ -315,7 +346,7 @@ 

Re: [Bug-wget] [PATCH] GnuTLS certificate loading

2013-08-05 Thread Tim Rühsen
Am Montag, 5. August 2013, 00:18:32 schrieb Giuseppe Scrivano:
 Tim Rühsen tim.rueh...@gmx.de writes:
  Some improvements to gnutls.c, especially improved certificate loading.
 
 thanks for the patch but it doesn't seem to apply to origin/master.
 
 On what version is it based?  Could you please rebase it on master?

I don't know what is wrong:

tim@debian:~/src/wget/trunk$ git pull
Already up-to-date.
tim@debian:~/src/wget/trunk$ git branch --list -a
* master
  parallel-wget
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
  remotes/origin/parallel-wget

What can I do ?
I admit, that I regularly have problems with syncing repositories (git pull 
fails, actions with fetch/merge resulting in chaos), ending up in rm -r and 
git clone.

I care for the rest as soon as this is cleared out.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] GnuTLS certificate loading

2013-08-07 Thread Tim Rühsen
Am Mittwoch, 7. August 2013, 00:18:25 schrieb Giuseppe Scrivano:
 Hi Tim,
 
 Tim Rühsen tim.rueh...@gmx.de writes:
  I don't know what is wrong:
  
  tim@debian:~/src/wget/trunk$ git pull
  Already up-to-date.
  tim@debian:~/src/wget/trunk$ git branch --list -a
  * master
  
parallel-wget
remotes/origin/HEAD -> origin/master
remotes/origin/master
remotes/origin/parallel-wget
  
  What can I do ?
  I admit, that I regularly have problems with syncing repositories (git
  pull
  fails, actions with fetch/merge resulting in chaos), ending up in rm -r
  and
  git clone.
 
 how does git log look for you?

tim@debian:~/src/wget/trunk$ git log
commit 1194317f35a014c878526dc3d2ada55ebd5fd6de
Author: Tim Ruehsen tim.rueh...@gmx.de
Date:   Sat Aug 3 19:56:39 2013 +0200

gnutls improvements

commit 76ad566279df2be6ab2229621ecd1f32cbc043a3
Merge: ec04e74 ffb9403
Author: Tim Ruehsen tim.rueh...@gmx.de
Date:   Wed Jul 24 22:44:06 2013 +0200

Merge branch 'master' of git://git.savannah.gnu.org/wget

commit ec04e74ff893f19b1e885fda24cf4d6f3fe9cd82
Author: Tim Ruehsen tim.rueh...@gmx.de
Date:   Wed Jul 24 22:44:01 2013 +0200

xxx

commit ffb94036f2116649a8de1a930820056aea9cb65f
Author: Tim Ruehsen tim.rueh...@gmx.de
Date:   Tue Jul 23 15:45:30 2013 +0200

openssl: fix build.

commit e22095a7641c8a74ed6b3566ad96e2bbb99258c7
Merge: 563cd95 92035db
Author: Tim Ruehsen tim.rueh...@gmx.de
Date:   Tue Jul 23 12:49:12 2013 +0200

Merge branch 'master' of git://git.savannah.gnu.org/wget

Conflicts:
configure.ac
src/http-ntlm.c
src/http.c

commit 92035dbabd8bb7fc3d10535c97adad2088c9129b
Author: Darshit Shah dar...@gmail.com
Date:   Mon Jul 22 19:35:53 2013 +0530

Fix erroneous error codes when HTTP Digest Authentication fails.

commit c19d76c02483f070beb688d6fe6f5fafb5674a08
Author: Tim Ruehsen tim.rueh...@gmx.de
Date:   Mon Jul 22 13:12:57 2013 +0200

ntlm: support libnettle.

commit 563cd95e11abe9ed28b60e3ac00ca09207523ec7
Merge: e296e66 a300f1e
Author: Tim Ruehsen tim.rueh...@gmx.de
Date:   Sun Jul 14 20:08:49 2013 +0200

Merge branch 'master' of git://git.savannah.gnu.org/wget

Conflicts:
src/ChangeLog
src/gnutls.c
src/http.c

commit e296e66ecc2ec7ce9d4ff935640d169861c77174
Author: Tim Ruehsen tim.rueh...@gmx.de
Date:   Sun Jul 14 19:58:53 2013 +0200

Check return value of create_authorization_line

commit a300f1e47d12877b13cb661a9742de443232dc1a
Author: Giuseppe Scrivano gscriv...@gnu.org
Date:   Fri Jul 12 23:44:21 2013 +0200

Fix some memory leaks a problem introduced with the last commit

commit 72b2c58983a63849acede55b9fc619c76d61bdae
Author: Steven M. Schweda s...@antinode.info
Date:   Sat Jul 13 12:00:30 2013 +0200

warc: Fix some portability issues on VMS.


 What happens if you try these two commands?
 
 $ git fetch -a
 $ git pull --rebase origin master

tim@debian:~/src/wget/trunk$ git fetch -a
tim@debian:~/src/wget/trunk$ git pull --rebase origin master
From git://git.savannah.gnu.org/wget
 * branch            master     -> FETCH_HEAD
First, rewinding head to replay your work on top of it...
Applying: add connect timeout to gnutls code
Using index info to reconstruct a base tree...
M   src/ChangeLog
M   src/gnutls.c
Falling back to patching base and 3-way merge...
Auto-merging src/gnutls.c
CONFLICT (content): Merge conflict in src/gnutls.c
Auto-merging src/ChangeLog
CONFLICT (content): Merge conflict in src/ChangeLog
Failed to merge in the changes.
Patch failed at 0001 add connect timeout to gnutls code
The copy of the patch that failed is found in:
   /home/tim/src/wget/trunk/.git/rebase-apply/patch

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

[fixed conflicts]

tim@debian:~/src/wget/trunk$ git rebase --continue
src/ChangeLog: needs merge
src/gnutls.c: needs merge
You must edit all merge conflicts and then
mark them as resolved using git add

tim@debian:~/src/wget/trunk$ git add src/ChangeLog src/gnutls.c
tim@debian:~/src/wget/trunk$ git rebase --continue
Applying: add connect timeout to gnutls code
Applying: Check return value of create_authorization_line
Using index info to reconstruct a base tree...
M   src/http.c
stdin:24: trailing whitespace.
  
warning: 1 line adds whitespace errors.
Falling back to patching base and 3-way merge...
Auto-merging src/http.c
CONFLICT (content): Merge conflict in src/http.c
Failed to merge in the changes.
Patch failed at 0002 Check return value of create_authorization_line
The copy of the patch that failed is found in:
   /home/tim/src/wget/trunk/.git/rebase-apply/patch

When you

Re: [Bug-wget] Review Request (Bug 39453)

2013-08-07 Thread Tim Rühsen
On Wednesday, 7 August 2013, 08:24:35, Will Dietz wrote:
 Hi all,
 
 There's a minor integer error in wget as described in the following bug
 report:
 
 https://savannah.gnu.org/bugs/?39453
 
 Patch is included, please review.
 
 Thanks!

Hi Will,

isn't the real problem a signed/unsigned comparison ?

If remaining_chars becomes negative (because token is longer than or equal to
line_length), the comparison
  if (remaining_chars <= strlen (token))
is false or at least undefined.

If we change it to
  if (remaining_chars <= (int) strlen (token))
the function should work.

Using gcc -Wsign-compare warns about such constructs.

Isn't there another bug, when setting
remaining_chars = line_length - TABULATION;
?
line_length might already be without TABULATION:
  if (line_length <= 0)
line_length = MAX_CHARS_PER_LINE - TABULATION;
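
To make the pitfall concrete, here is a minimal standalone sketch (my
illustration, not wget code):

  #include <stdio.h>
  #include <string.h>

  int main (void)
  {
    int remaining_chars = -3;       /* token was longer than line_length */
    const char *token = "option";

    /* strlen() returns size_t (unsigned), so remaining_chars is converted
       to unsigned: -3 becomes a huge positive value and the test is false. */
    if (remaining_chars <= strlen (token))
      puts ("wrap");                /* never reached */

    /* The cast keeps the comparison signed, as intended. */
    if (remaining_chars <= (int) strlen (token))
      puts ("wrap (signed compare)");  /* printed */

    return 0;
  }

Compiling with gcc -Wsign-compare flags the first comparison.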

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] GnuTLS certificate loading

2013-08-09 Thread Tim Rühsen
I deleted and git-cloned the complete repository.
And moved my copy of gnutls.c into src.

  +#include "hash.h"
 can you please add an explicit dependency to the hash module in
 bootstrap.conf?

??? It is the hash.h from the src directory. Why and where should it go into 
bootstrap.conf ?

Regards, Tim
From 0fd97ed11d9f740de78d540a13db40548c8d7391 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Fri, 9 Aug 2013 21:56:01 +0200
Subject: [PATCH] gnutls improvements

---
 src/ChangeLog |   7 
 src/gnutls.c  | 133 +-
 2 files changed, 92 insertions(+), 48 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index ab38d45..edfb80f 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,10 @@
+2013-08-09  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* gnutls.c (ssl_init): Prevent CA files from being loaded twice
+	  if possible.
+	* gnutls.c (ssl_check_certificate): Added some error messages
+	* gnutls.c: Fixed some compiler warnings
+
 2013-08-08  Will Dietz  w...@wdtz.org (tiny change):
 
 	* main.c (format_and_print_line): Wrap correctly long tokens.
diff --git a/src/gnutls.c b/src/gnutls.c
index 06f9020..06e263e 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -46,6 +46,7 @@ as that of the covered work.  */
 #include "connect.h"
 #include "url.h"
 #include "ptimer.h"
+#include "hash.h"
 #include "ssl.h"
 
 #include <sys/fcntl.h>
@@ -81,49 +82,79 @@ ssl_init (void)
 {
   /* Becomes true if GnuTLS is initialized. */
   static bool ssl_initialized = false;
+  const char *ca_directory;
+  DIR *dir;
+  int ncerts = -1;
 
   /* GnuTLS should be initialized only once. */
   if (ssl_initialized)
 return true;
 
-  const char *ca_directory;
-  DIR *dir;
-
   gnutls_global_init ();
   gnutls_certificate_allocate_credentials (credentials);
-  gnutls_certificate_set_verify_flags(credentials,
-  GNUTLS_VERIFY_ALLOW_X509_V1_CA_CRT);
+  gnutls_certificate_set_verify_flags (credentials,
+   GNUTLS_VERIFY_ALLOW_X509_V1_CA_CRT);
 
-  ca_directory = opt.ca_directory ? opt.ca_directory : "/etc/ssl/certs";
+#if GNUTLS_VERSION_MAJOR >= 3
+  if (!opt.ca_directory)
+ncerts = gnutls_certificate_set_x509_system_trust (credentials);
+#endif
 
-  dir = opendir (ca_directory);
-  if (dir == NULL)
+  /* If GnuTLS version is too old or CA loading failed, fallback to old behaviour.
+	* Also use old behaviour if the CA directory is user-provided */
+  if (ncerts <= 0)
 {
-  if (opt.ca_directory && *opt.ca_directory)
-logprintf (LOG_NOTQUIET, _("ERROR: Cannot open directory %s.\n"),
-   opt.ca_directory);
-}
-  else
-{
-  struct dirent *dent;
-  while ((dent = readdir (dir)) != NULL)
+  ca_directory = opt.ca_directory ? opt.ca_directory : "/etc/ssl/certs";
+
+  if ((dir = opendir (ca_directory)) == NULL)
 {
-  struct stat st;
-  char *ca_file;
-  asprintf (&ca_file, "%s/%s", ca_directory, dent->d_name);
+  if (opt.ca_directory && *opt.ca_directory)
+logprintf (LOG_NOTQUIET, _("ERROR: Cannot open directory %s.\n"),
+   opt.ca_directory);
+}
+  else
+{
+  struct hash_table *inode_map = hash_table_new (196, NULL, NULL);
+  struct dirent *dent;
+  size_t dirlen = strlen(ca_directory);
+  int rc;
 
-  stat (ca_file, &st);
+  ncerts = 0;
 
-  if (S_ISREG (st.st_mode))
-gnutls_certificate_set_x509_trust_file (credentials, ca_file,
-GNUTLS_X509_FMT_PEM);
+  while ((dent = readdir (dir)) != NULL)
+{
+  struct stat st;
+  char ca_file[dirlen + strlen(dent->d_name) + 2];
 
-  free (ca_file);
-}
+  snprintf (ca_file, sizeof(ca_file), "%s/%s", ca_directory, dent->d_name);
+
+  if (stat (ca_file, &st) != 0)
+continue;
+
+  if (! S_ISREG (st.st_mode))
+continue;
+
+  /* avoid loading the same file twice by checking the inode */
+  if (hash_table_contains (inode_map, (void *)(intptr_t) st.st_ino))
+continue;
+
+  hash_table_put (inode_map, (void *)(intptr_t) st.st_ino, NULL);
+
+  if ((rc = gnutls_certificate_set_x509_trust_file (credentials, ca_file,
+GNUTLS_X509_FMT_PEM)) <= 0)
+logprintf (LOG_NOTQUIET, _("ERROR: Failed to open cert %s: (%d).\n"),
+   ca_file, rc);
+  else
+ncerts += rc;
+}
 
-  closedir (dir);
+  hash_table_destroy (inode_map);
+  closedir (dir);
+}
 }
 
+  DEBUGP (("Certificates loaded: %d\n", ncerts));
+
   /* Use the private key from the cert file unless otherwise specified. */
  if (opt.cert_file && 

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-12 Thread Tim Rühsen
On Thursday, 12 September 2013, 12:59:00, Björn Mattsson wrote:
 Ran into a bug in wget last week.
 Did some digging but can't solve it by myself.
 
 If I try to wget a file containing capital ÅÄÖ, they get converted
 wrongly, while åäö works fine.
 
 I use wget -m to back up one of my web sites to another machine. It has
 worked like a charm for the last 4-5 years, but a couple of weeks ago one
 of the files came down wrong. I thought it was a colleague that had
 uploaded something wrong, but after some digging it's wget that converts
 wrongly.
 
 I have UTF-8 as the charset on my machine.
 
 If you want to test/see the problem
 
 wget -m http://bmit.se/wget

Just use 
wget --restrict-file-names=nocontrol -m http://bmit.se/wget

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-12 Thread Tim Rühsen
On Thursday, 12 September 2013, 17:37:17, Tim Ruehsen wrote:
 On Thursday 12 September 2013 12:59:00 Björn Mattsson wrote:
  Ran into a bug in wget last week.
  Did some digging but can't solve it by myself.
  
  If I try to wget a file containing capital ÅÄÖ, they get converted
  wrongly, while åäö works fine.
  
  I use wget -m to back up one of my web sites to another machine. It has
  worked like a charm for the last 4-5 years, but a couple of weeks ago
  one of the files came down wrong. I thought it was a colleague that had
  uploaded something wrong, but after some digging it's wget that converts
  wrongly.
  
  I have UTF-8 as the charset on my machine.
  
  If you want to test/see the problem
  
  wget -m http://bmit.se/wget
 
 A request to http://bmit.se/wget/ returns a text/html document without
 specifying the charset (AFAIR, the default is iso-8859-1).
 Either your server has to tag the response as utf-8 (Content-Type:
 text/html; charset=utf-8) or you have to specify utf-8 in your document
 header.
 
 Or you specify --remote-encoding=utf-8 when calling wget.
 
 Could you give it a try, maybe with -d to see what is going on.

Sorry, forget my answer.
Meanwhile I could run some tests in a utf-8 env, and yes, Wget 1.14 (Debian 
package as well as current git) has the problem you described.

I am not sure if we can change it without breaking backward compatibility !?

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] wget error: failed: Connection timed out.

2013-09-30 Thread Tim Rühsen
 I can download the link through any browser; the link is 
 http://developer.blackberry.com/native/downloads/fetch/BlackBerry10Simulator
 -Installer-BB10_2_0X-1155-Win-201308081613.exe 
 
 I installed wsproxy.exe in my c:\windows folder and get following error:

Just a guess.

You are behind a firewall and have to use a proxy.
The current Wget has -e http_proxy=host:port or uses the environment variable 
http_proxy. If you need a username/password for your proxy, use --proxy-user 
and/or --proxy-password.
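
For example (proxy host, port and credentials here are only placeholders):

  wget -e http_proxy=proxy.example.com:3128 \
   --proxy-user=myuser --proxy-password=mypass URL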

Consult the Wget docs for details.
Have a look into your browser's preferences to find the proxy settings.

Good luck.

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [ PATCH ] LIST changes (ver. 2)

2013-10-17 Thread Tim Rühsen
On Thursday, 17 October 2013, 12:55:18, Andrea Urbani wrote:
 Hi,
 first of all I'm sorry: I was not subscribed to the bug-wget list so I saw
 only yesterday the replies of other users.
 
 Well, this patch replaces the previous ones from me.
 
 Now wget, after the SYST command, looks if it knows that system.
 If yes, wget will force the use of "LIST" or "LIST -a".
 If not, wget will try, only the first time in each session, first the
 "LIST -a" command and then "LIST".
 If "LIST -a" works and returns at least as much data as "LIST",
 "LIST -a" will be the standard list command for the whole session.
 If "LIST -a" fails or returns less data than "LIST" (think of the case
 of an existing file called "-a"), "LIST" will be the standard list
 command for the whole session.
 
 Well, there is an unhandled situation (that I will not fix, at least for
 now): I'm on an unknown system that recognises "LIST -a" as "give me the
 -a files/folders", I have to download files from different folders, and
 the starting ftp folder contains only one "-a" folder, and no "." and
 ".." folders are returned ! :-O)
 In this case wget will try "LIST -a", then "LIST". The result will be
 the same, so "LIST -a" will be taken, but, as soon as wget goes
 inside the "-a" folder, the problems will begin...
 
 About the check for known systems: I force "LIST" when the system is ST_VMS or
 exactly "215 UNIX MultiNet Unix Emulation V5.3(93)". If the system is like
 "215 UNIX Type: L8" I force "LIST -a".
 On all other systems, I try "LIST -a" and afterwards "LIST" (only the first
 time). I don't force "LIST" for ST_WINNT because in ftp-ls.c it is written,
 inside ftp_parse_ls,
 
 /* Detect whether the listing is simulating the UNIX format */
 
 so there are strange situations there that I can't test.
 
 About MultiNet, I have written to the developers to know if I can check a
 more general "215 UNIX MultiNet" or not.
 
 I have tested the sites:
 ftp://ftp.info-zip.org/
 ftp://ftp.freebsd.org/
 ftp://antinode.info/moz_test/
 ftp://ftp.microsoft.com/
 ftp://ftp.adobe.com/
 ftp://ftp.gnu.org/
 ftp://ftp.ncftp.com/
 
 I have also added the following test cases:
 
  * Test-ftp-list-Multinet.px: Test "LIST" on a "UNIX MultiNet
  Unix Emulation" system that returns an empty content when
  "LIST -a" is requested (probably because no "-a" files
  exist)
 
  * Test-ftp-list-Unknown.px: Test "LIST" on an "Unknown ftp
  service" system that returns an empty content when
  "LIST -a" is requested (probably because no "-a" files
  exist)
 
  * Test-ftp-list-Unknown-a.px: Test "LIST" on an "Unknown ftp
  service" system that recognises "LIST -a" as "give me the
  -a file" and there is a "-a" file + two other files.
  "LIST -a" will return only "-a", "LIST" all three files.
 
  * Test-ftp-list-Unknown-hidden.px: Test "LIST" on an "Unknown ftp
  service" system that recognises "LIST -a" as on a "UNIX Type:
  L8" system (show me also the hidden files) and there is a
  hidden file.
 
  * Test-ftp-list-Unknown-list-a-fails.px: Test "LIST" on an
  "Unknown ftp service" system that raises an error on the
  "LIST -a" command.
 
  * Test-ftp-list-UNIX-hidden.px: Test "LIST" on a "UNIX Type:
  L8" system that recognises "LIST -a" as "show me also the
  hidden files" and there is a hidden file.
 
 Everything should be ok. If not, let me know. (Now I'm subscribed to
 bug-wget)

Hey Andrea,

nice work.

But I just can't apply your patch to up-to-date Wget sources (patch throws 
messages at me and thereafter Wget is not compilable at all).

Maybe you can commit all your changes locally and create a patch with
git format-patch -1
?

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] I am seeing problems with wget-1.14.96-38327 doing gnutls secure sessions.

2013-11-04 Thread Tim Rühsen
Hi Sci-Fi @ hush.ai, found a prob on your XPI (nice rhyme !)

Your problem is reproducible here by using
-e timeout=20 -e check-certificate=off 

A workaround is 
-e timeout=0

It must be some sort of regression, as you say.
I have no time to dig, but maybe my observation might help someone to find it.


 Certificates loaded: -1250
? Holy sheepshit, what is this ?
GNUTLS_E_UNIMPLEMENTED_FEATURE returned by 
gnutls_certificate_set_x509_system_trust().

Fixed in attached patch.

Tim


On Monday, 4 November 2013, 16:36:56, SciFi wrote:
 Hi,
 
 (I am still here, still running OSX 10.6.8
  with all security updates etc.)
 
 I've compiled the 1.14.96-38327 tarball here.
 
 With it, I'm suddenly getting retries when I need to
 fetch something with https
 (while regular http seems ok)
 no matter what server I need to pull from.
 
 I also updated gnutls to 3.2.6
 and nettle to 2.7
 just in case
 but no help in this regard.
 
 For example, here's a wget of
 the nightly Enigmail build
 
 in debug mode:
  $ wget -d https://www.enigmail.net/download/nightly/enigmail-nightly-all.xpi
  DEBUG output created by Wget 1.14.96-38327 on darwin10.8.0.
  
  URI encoding = ‘UTF-8’
  --2013-11-04 10:06:45-- 
  https://www.enigmail.net/download/nightly/enigmail-nightly-all.xpi
  Certificates loaded: -1250
  Resolving www.enigmail.net (www.enigmail.net)... 217.26.54.154
  Caching www.enigmail.net = 217.26.54.154
  Connecting to www.enigmail.net (www.enigmail.net)|217.26.54.154|:443...
  connected. Created socket 4.
  Releasing 0x01091670 (new refcount 1).
  WARNING: No certificate presented by www.enigmail.net.
  
  ---request begin---
  GET /download/nightly/enigmail-nightly-all.xpi HTTP/1.1
  User-Agent: Wget/1.14.96-38327 (darwin10.8.0)
  Accept: */*
  Host: www.enigmail.net
  Connection: Keep-Alive
  
  ---request end---
  HTTP request sent, awaiting response... Read error (Success.) in headers.
  Retrying.
  
  --2013-11-04 10:06:47--  (try: 2) 
  https://www.enigmail.net/download/nightly/enigmail-nightly-all.xpi Found
  www.enigmail.net in host_name_addresses_map (0x1091670)
  Connecting to www.enigmail.net (www.enigmail.net)|217.26.54.154|:443...
  connected. Created socket 4.
  Releasing 0x01091670 (new refcount 1).
  WARNING: No certificate presented by www.enigmail.net.
  
  ---request begin---
  GET /download/nightly/enigmail-nightly-all.xpi HTTP/1.1
  User-Agent: Wget/1.14.96-38327 (darwin10.8.0)
  Accept: */*
  Host: www.enigmail.net
  Connection: Keep-Alive
  
  ---request end---
  HTTP request sent, awaiting response... Read error (Success.) in headers.
  Retrying.
  
  --2013-11-04 10:06:49--  (try: 3) 
  https://www.enigmail.net/download/nightly/enigmail-nightly-all.xpi Found
  www.enigmail.net in host_name_addresses_map (0x1091670)
  Connecting to www.enigmail.net (www.enigmail.net)|217.26.54.154|:443...
  connected. Created socket 4.
  Releasing 0x01091670 (new refcount 1).
  WARNING: No certificate presented by www.enigmail.net.
  
  ---request begin---
  GET /download/nightly/enigmail-nightly-all.xpi HTTP/1.1
  User-Agent: Wget/1.14.96-38327 (darwin10.8.0)
  Accept: */*
  Host: www.enigmail.net
  Connection: Keep-Alive
  
  ---request end---
  HTTP request sent, awaiting response... Read error (Success.) in headers.
  Retrying.
  
  ^C
 
 I can fetch this file ok
 with 1.14.96-38327
 if I use plain http.  ;)
 
 
 I saved the current stable 1.14 build of wget
 and it fetches from https ok.
 So this might be a regression of some sort.
 
 My ~/.wgetrc (for all wget versions/sessions shown here):
  $ cat ~/.wgetrc
  tries = 0
  continue = on
  timestamping = on
  timeout = 20
  waitretry = 5
  random_wait = on
  #inet4_only = on
  #prefer_family = IPv4
  retry_connrefused = on
  check-certificate = off
  trust-server-names = on
  #content-on-error = on
  auth-no-challenge = on
  ca-certificate = /usr/local/share/wget/cacert.pem
  robots = off
  #load-cookies = /Users/scifi/Library/Application
  Support/Camino/cookies.txt
 
 My compile parms:
  $ wget --version
  GNU Wget 1.14.96-38327 built on darwin10.8.0.
  
  +digest +https +ipv6 +iri +large-file +nls +ntlm +opie +ssl/gnutls
  
  Wgetrc:
  /Users/scifi/.wgetrc (user)
  /usr/local/etc/wgetrc (system)
  
  Locale:
  /usr/local/share/locale
  
  Compile:
  gcc-4.2 -DHAVE_CONFIG_H -DSYSTEM_WGETRC=/usr/local/etc/wgetrc
  -DLOCALEDIR=/usr/local/share/locale -I. -I../lib -I../lib
  -I/usr/local/ssl/include -I/usr/X11/include -I/usr/local/include
  -I/WhichXcode/Headers/FlatCarbon -I/usr/include
  -I/usr/local/include -Os -mtune=core2 -march=core2
  -force_cpusubtype_ALL -arch i386
  
  Link:
  gcc-4.2 -Os -mtune=core2 -march=core2 -force_cpusubtype_ALL -arch
  i386 -Os -mtune=core2 -march=core2 -force_cpusubtype_ALL -arch i386
  -L/usr/local/lib -L/usr/local/lib -liconv -L/usr/local/lib -lintl
  -Wl,-framework -Wl,CoreFoundation -lnettle -L/usr/local/lib
  -lgnutls -L/usr/local/ssl/lib 

Re: [Bug-wget] fix: wget hangs with -r and -O - (bug #40426)

2013-11-14 Thread Tim Rühsen
On Monday, 11 November 2013, 18:06:53, daniele.cal...@tin.it wrote:
 Hello,
 
 Attached is a fix for bug #40426

Hi Daniele,

thanks for your contribution.

But it would be nice to have -O and -r working together.

Did you try to find out why Wget blocks ?

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] fix: wget hangs with -r and -O - (bug #40426)

2013-11-14 Thread Tim Rühsen
On Thursday, 14 November 2013, 21:00:13, Tim Rühsen wrote:
 On Monday, 11 November 2013, 18:06:53, daniele.cal...@tin.it wrote:
  Hello,
  
  Attached is a fix for bug #40426
 
 Hi Daniele,
 
 thanks for your contribution.
 
 But it would be nice to have -O and -r working together.
 
 Did you try to find out why Wget blocks ?

You are right in fixing it the quick way.
Since Wget's designers decided to first save to disk and then load the same 
file again, -O - and -r won't work together.

Wget is hanging because it reads from STDIN and waits for data.

To fix it properly, the downloaded file should stay in memory to be parsed, OR 
be saved twice when -O - comes together with -r ...

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] wget seems to be out of touch with security (fails on most (all?) http websites...(where browsers work)

2013-12-20 Thread Tim Rühsen
On Friday, 20 December 2013, 09:03:43, L Walsh wrote:
 But at the end of the update script, I notice a message:
 if ($foundignored)
 {
print STDERR "\n* = CA Certificates in /etc/ssl/certs are only seen by
 some legacy applications.
 To install CA-Certificates globally move them to /etc/pki/trust/ancors
 instead!\n"; }
 
 Perhaps wget isn't using the new location?

Wget is using /etc/ssl/certs by default.

If the distribution uses a different directory, the package maintainer should 
change the default directory either by providing a patch or by specifying the 
directory in /etc/wgetrc.

Have a look into /etc/ssl/certs and /etc/pki/trust/ancors to see which of them 
fits your needs.

Assuming you want /etc/pki/trust/ancors as the certificate directory, put it 
into /etc/wgetrc (or into ~/.wgetrc):

cadirectory=/etc/pki/trust/ancors


BTW, the 'Go Daddy' certs are named here (Debian SID) Go_Daddy_*

It is a good idea to submit a bug report for the wget package of your dist (if 
it hasn't already been done by someone else).

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] wget seems to be out of touch with security (fails on most (all?) http websites...(where browsers work)

2013-12-20 Thread Tim Rühsen
On Friday, 20 December 2013, 13:54:12, Mike Frysinger wrote:
 On Friday 20 December 2013 12:03:43 L Walsh wrote:
  Perhaps wget isn't using the new location?
 
 openssl manages its cert locations itself, not wget.  file a bug for your
 distro.

You are right.
What I wrote before about /etc/ssl/certs applies to Wget +gnutls only. Sorry.

Tim


signature.asc
Description: This is a digitally signed message part.


[Bug-wget] [PATCH] Re: ping (Re: I am seeing problems with wget-1.14.96-38327 doing gnutls secure sessions.)

2013-12-26 Thread Tim Rühsen
On Thursday, 26 December 2013, 01:26:00, SciFi wrote:
 ping
 
 I guess I need to remind about this bug,
 I haven't opened a real bugzilla report, tho.
 Shall I?
 
 FWIW, I've changed to the timeout=0 setting,
 which did let the httpS code work.
 I'll need to have a non-infinite setting
 for some projects I have that use wget.
 
 And I've hand-applied the patch below.
 No ill effects there.
 
 Happy Holidays!

The regression has been introduced by this change:

2013-05-05  mancha  manc...@hush.com (tiny change)

* gnutls.c (ssl_connect_wget): Don't abort on non-fatal alerts
received during handshake. For example, when connecting to servers
using TLS-SNI that send warning-level unrecognized_name alerts.

You could trigger it by compiling/linking with GnuTLS and using --connect-
timeout=x or --timeout=x (x > 0).

I attached a fix.

Tim
From 41f9db4f5d309d605d90613c1dd5c208be8024aa Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Thu, 26 Dec 2013 21:17:07 +0100
Subject: [PATCH] fix GnuTLS connect timeout

---
 src/ChangeLog | 4 
 src/gnutls.c  | 5 ++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index fe4c321..22d036c 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,7 @@
+2013-12-26  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* gnutls.c (ssl_connect_wget): Fix connect timeout failure
+
 2013-11-10  Giuseppe Scrivano  gscri...@redhat.com
 
 	* options.h (struct options) [!ENABLE_THREADS]: Define jobs.
diff --git a/src/gnutls.c b/src/gnutls.c
index 9b4b1ec..4f0fa96 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -526,8 +526,7 @@ ssl_connect_wget (int fd, const char *hostname)
   break;
 }
 
-  if (err <= 0)
-break;
+   err = GNUTLS_E_AGAIN;
 }
  else if (err < 0)
 {
@@ -543,7 +542,7 @@ ssl_connect_wget (int fd, const char *hostname)
 }
 }
 }
-  while (err == GNUTLS_E_WARNING_ALERT_RECEIVED && gnutls_error_is_fatal (err) == 0);
+  while (err && gnutls_error_is_fatal (err) == 0);
 
   if (opt.connect_timeout)
 {
-- 
1.8.5.2



signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] General Testsuite issue

2014-01-17 Thread Tim Rühsen
On Friday, 17 January 2014, 11:42:41, Tony Lewis wrote:
 Darshit Shah wrote:
  In case both the --config and --no-config commands are issued, the one
 
 that
 
  appears first on the command will be considered and the other ignored.
 
 Given my memory of the way the parsing loop works, I would expect that it
 would use the last one that appears. How do GNU commands usually handle
 multiple instances of a command option?

Wget, like most tools, parses the arguments from left to right, the second 
overwriting the first. Otherwise (e.g. if arguments 'sum up'), it should be 
explicitly mentioned in the docs.
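
As a standalone illustration (my sketch, not Wget's actual parser) of why the
rightmost occurrence wins in a getopt-style loop:

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main (int argc, char **argv)
  {
    int opt, value = 0;

    /* Each occurrence of -n simply overwrites 'value',
       so the last (rightmost) one wins. */
    while ((opt = getopt (argc, argv, "n:")) != -1)
      if (opt == 'n')
        value = atoi (optarg);

    printf ("n = %d\n", value);   /* "./prog -n 1 -n 2" prints n = 2 */
    return 0;
  }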

Tim

signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Overly permissive hostname matching

2014-03-18 Thread Tim Rühsen
Hi Jeffrey,

thanks for pointing this out.

BTW, to reproduce the issue I used a GnuTLS compiled/linked version of Wget:

$ wget -d --ca-certificate=ca-rsa-cert.pem --private-key=ca-rsa-key-plain.pem 
https://example.com:8443
2014-03-18 21:48:04 (1.88 GB/s) - Read error at byte 5116 (The TLS connection 
was non-properly terminated.). Retrying.

There seems to be a problem in Wget 1.15 (on Debian SID)...


But despite from that, Wget uses the hostname checking facility of the GnuTLS 
library (or of OpenSSL library if appropriately compiled). And I saw you 
already addressed bug-gnutls, which seems the right way to go.

IMHO, the Public Suffix List (PSL) should not only be used to verify cookies but 
also for certificate hostname checking.

Libraries like GnuTLS should offer an API for this kind of checking. Best would 
be having the PSL as a separate file, maintained by the distribution 
maintainers (or the user, if he wants to do it). The SSL library should 
load/unload the PSL under the application's control.

Maybe it would be a good idea to provide a separate PSL library that could be 
used by SSL libraries for hostname checking and HTTP(S) clients for cookie 
verification.

If of any interest, there is already some LGPLed code at
  https://github.com/rockdaboot/mget/blob/master/libmget/cookie.c
There are also some unit test routines in the project.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Overly permissive hostname matching

2014-03-20 Thread Tim Rühsen
On Wednesday, 19 March 2014, 10:59:05, Daniel Kahn Gillmor wrote:
 I'm imagining a C library API that has a public suffix list context
 object that can do efficient lookups (however we define the lookups),
 and the library would bundle a pre-compiled context, based on the
 currently-known public suffix list.
 
 something like:
 
 ---
 struct psl_ctx;
 typedef struct psl_ctx * psl_ctx_t;
 const psl_ctx_t psl_builtin;
 
 psl_ctx_t psl_new_ctx_from_filename(const char* filename);
 psl_ctx_t psl_new_ctx_from_fd(int fd);
 void psl_free_ctx(psl_ctx_t ctx);
 
 /*
   query forms, very rough draft -- do we need both?
   need to consider memory allocation responsibilities and
   DNS internationalization/canonicalization issues
 */
 
 const char* psl_get_public_suffix(const psl_ctx_t, const char* domain);
 const char* psl_get_registered_domain(const psl_ctx_t, const char* d);
 ---

I broke out the public suffix code and created a first go (really very quick; 
distcheck fails - I couldn't figure out why this evening).

https://github.com/rockdaboot/libpsl

The first step was a psl_is_tld() function.
There is a test case for some major things (wildcards, exceptions).

I hope there will be some interest and some contributions...
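
Purely for illustration, a consumer of Daniel's draft API above might look
like this (hypothetical - header name, types and semantics are not settled):

  #include <stdio.h>
  #include "psl.h"   /* hypothetical header exposing the draft API */

  int main (void)
  {
    /* psl_builtin is the pre-compiled built-in context from the draft. */
    const char *host = "www.example.co.uk";
    const char *suffix = psl_get_public_suffix (psl_builtin, host);
    const char *domain = psl_get_registered_domain (psl_builtin, host);

    printf ("public suffix: %s\n", suffix ? suffix : "(none)");     /* "co.uk" */
    printf ("registered domain: %s\n", domain ? domain : "(none)"); /* "example.co.uk" */
    return 0;
  }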

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] libpsl discussion

2014-03-22 Thread Tim Rühsen
I created a google group mailing list for further libpsl discussion.

(hope it works)
https://groups.google.com/forum/#!forum/libpsl-bugs/join

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] libpsl design

2014-03-23 Thread Tim Rühsen
On Saturday, 22 March 2014, 17:41:27, Daniel Kahn Gillmor wrote:
  I would still like to move the discussion to libpsl-bugs, but so far
  nobody is reading it ...
 
 I've tried to subscribe, but apparently i have to be approved first.
 please approve me! :)

Sorry, that was my fault (a misconfiguration) though my test with a separate 
email worked fine (maybe it was also registered for my google account).

All join requests were accepted. Future subscriptions are accepted 
automatically (I tested that with a third email address).

I answer the main points of your email later...

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [Bug-Wget] It's spring cleaning time!

2014-03-30 Thread Tim Rühsen
On Sunday, 30 March 2014, 16:00:07, Darshit Shah wrote:
 Hello,
 
 I've been wanting to clean up the code for Wget for some time now.
 Today, I wrote a small script that compiles Wget with a bunch of
 warning flags and uploads each warning to GitHub as an issue.
 
 These issues have been created against the fork of Wget that I
 maintain on https://github.com/darnir/wget
 
 As of now, I compiled Wget using Clang and CFLAGS="-Wextra -Wall
 -pedantic -std=gnu89"
 
 These flags end up spewing around 60 warnings, all of them semantic
 issues. The issues have been labeled based on the flag that caused
 them and the type of issue it is according to clang.
 
 I wanted to run clang with -Weverything, but that throws nearly 800+
 warnings. Hence, I figured, let's clean this up first and we can then
 begin looking at the other warnings. Some of them are probably false
 positives, but others seem to have some more subtle problems with
 platform independence.

I like the idea (always liked it) of fixing all these warnings - I guess some of 
my patches were lost...

However you define 'false positive', they simply pollute the screen, and when 
searching for real issues, you have to read through them again and again. In 
this sense, there is no false positive - except when the compiler has a bug 
(the clang analyzer really finds some (cough) interesting things).

I once made a patch that fixed those 'const' warnings when compiling with -
DDEBUG (triggers selftest code). But I think it was not applied :-(

Anyway, good idea, and I'll try to find some time!

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] warning about unknown .wgetrc directives

2014-04-04 Thread Tim Rühsen
On Friday, 4 April 2014, 17:14:07, Darshit Shah wrote:
 On Fri, Apr 4, 2014 at 4:40 PM, Giuseppe Scrivano gscriv...@gnu.org wrote:
  Hi Karl,
  
  k...@freefriends.org (Karl Berry) writes:
   Giuseppe et al.,
   
   I suggest making unknown .wgetrc directives a warning (and just ignore
   them, proceeding on normally), rather than a failure.  For purposes of
   compatibility - a person might have a brand-new wget on system A, but
   for whatever reason, have to run an older wget on system B.  But it's
   convenient to have the same wget regardless.
  
  In general I tend to agree with you as it makes it easier to reuse
  the .wgetrc file, but I think problems with unknown directives should
  still be treated as errors.
  It may happen that we will add some security related directive,
  and while users rely on wget to honor that, wget instead will simply
  ignore it and give the impression it works.
  
  Unless we add something like --ignore-wgetrc-errors...
 
 I think that's over-engineering the problem.
 
 Some time ago, Tim, if I remember correctly, proposed using version lines.
 So, newer commands can be marked as valid under a certain version only. The
 whole scheme can be made backward compatible by assuming the lack of a
 version line to imply the current version.

I thought it was about a protocol to support external programs...

In any case: we can't fix Karl's problem for now... the machines running an old 
Wget won't update.

So we are talking about future versions and how we deal with such situations 
in the future. And here, a 'version' line could in fact help.

1. Reading a .wgetrc file with none or with a 'known' version: treat unknown 
directives as errors.

2. Reading a .wgetrc file with with a 'future' version: treat unknown 
directives as warnings and continue.

And to make 2. more intelligent, we could use fuzzy comparisons:
e.g. if we read 'remoteencding' (something similar to 'remoteencoding') AND 
there is no exact 'remoteencoding' or '#remoteencoding' found in .wgetrc, we 
could raise an error.
Maybe we should put this into a library... it would be useful for many tools 
;-)
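
For illustration, such a versioned .wgetrc could look like this (purely
hypothetical syntax - no 'version' directive exists in any released Wget):

  version = 1.16
  remoteencoding = utf-8
  # an older Wget reading this file would see a 'future' version above
  # and only warn about directives it does not know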

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] bad filename

2014-04-24 Thread Tim Rühsen
On Thursday, 24 April 2014, 20:00:18, Andries E. Brouwer wrote:
 On Thu, Apr 24, 2014 at 03:43:40PM +0200, Tim Ruehsen wrote:
  1. How do you know, what filesystem you are writing to ?
  I just think of these fat32 USB sticks flying around everywhere.
  UTF-8 might be a problem (see
  http://en.wikipedia.org/wiki/Comparison_of_file_systems).
  I just mention fat32, because it is pretty common.
 
 Wget already knows about such restrictions.
 These high control bytes have no special status in FAT32,
 so not escaping them does not introduce any problems there.
 
  2. Backward compatibility.
 
 In this particular case I see no reason to expect any problems.

Well, I submitted my concerns as 'advocate of affected users' - and I am done 
:-)
Personally, I would like to see your proposed change as fast as possible.

[OT]
 My answer would be that case converting UTF-8 is something to avoid.
 For ASCII, case conversion is simple and well-defined.

I guess the problem is already solved. We have international domain names 
which must be converted to lowercase before being encoded to 'punycode'.
And how should one case-insensitively compare two Unicode strings if there is 
no solution ? If a Turkish person can do it, a computer should be able to 
do it.

At least the greek Σ lowercase conversion is well defined.
Also the german 'ß' uppercase is 'ẞ' since 2008.
Maybe a few corner cases are still under construction.

Tim

signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [Bug-Wget] It's spring cleaning time!

2014-05-11 Thread Tim Rühsen
On Sunday, 11 May 2014, 15:56:15, Giuseppe Scrivano wrote:
 Darshit Shah dar...@gmail.com writes:
  Subject: [PATCH] Fix LOTS of compiler warnings
 
 great work!  Just some minor comments:
  * http.c: Fix small memory leak
  
  diff --git a/src/css-url.c b/src/css-url.c
  index f97690d..51e43b4 100644
  --- a/src/css-url.c
  +++ b/src/css-url.c
  @@ -64,8 +64,8 @@ typedef struct yy_buffer_state *YY_BUFFER_STATE;
  
   extern YY_BUFFER_STATE yy_scan_bytes (const char *bytes,int len  );
   extern int yylex (void);
  
  -#if 1
  -const char *token_names[] = {
  +#if 0
  +static const char *token_names[] = {
  
 CSSEOF,
 S,
 CDO,
 
 if this is not needed, what about just dropping it?  I would prefer we
 keep code that is used somewhere.

Then let's drop it.
Due to version control, the code is not really lost... (if someone wants to 
use it again).

 
  -#define NTLMFLAG_NEGOTIATE_DOMAIN_SUPPLIED   (1<<12)
  -#define NTLMFLAG_NEGOTIATE_WORKSTATION_SUPPLIED  (1<<13)
  -#define NTLMFLAG_NEGOTIATE_LOCAL_CALL(1<<14)
  -#define NTLMFLAG_NEGOTIATE_ALWAYS_SIGN   (1<<15)
  -#define NTLMFLAG_TARGET_TYPE_DOMAIN  (1<<16)
  -#define NTLMFLAG_TARGET_TYPE_SERVER  (1<<17)
  -#define NTLMFLAG_TARGET_TYPE_SHARE   (1<<18)
  -#define NTLMFLAG_NEGOTIATE_NTLM2_KEY (1<<19)
  -#define NTLMFLAG_REQUEST_INIT_RESPONSE   (1<<20)
  -#define NTLMFLAG_REQUEST_ACCEPT_RESPONSE (1<<21)
  -#define NTLMFLAG_REQUEST_NONNT_SESSION_KEY   (1<<22)
  -#define NTLMFLAG_NEGOTIATE_TARGET_INFO   (1<<23)
  +/* #define NTLMFLAG_NEGOTIATE_DOMAIN_SUPPLIED   (1<<12) */
  +/* #define NTLMFLAG_NEGOTIATE_WORKSTATION_SUPPLIED  (1<<13) */
  +/* #define NTLMFLAG_NEGOTIATE_LOCAL_CALL(1<<14) */
  +/* #define NTLMFLAG_NEGOTIATE_ALWAYS_SIGN   (1<<15) */
  +/* #define NTLMFLAG_TARGET_TYPE_DOMAIN  (1<<16) */
  +/* #define NTLMFLAG_TARGET_TYPE_SERVER  (1<<17) */
  +/* #define NTLMFLAG_TARGET_TYPE_SHARE   (1<<18) */
  +/* #define NTLMFLAG_NEGOTIATE_NTLM2_KEY (1<<19) */
  +/* #define NTLMFLAG_REQUEST_INIT_RESPONSE   (1<<20) */
  +/* #define NTLMFLAG_REQUEST_ACCEPT_RESPONSE (1<<21) */
  +/* #define NTLMFLAG_REQUEST_NONNT_SESSION_KEY   (1<<22) */
  +/* #define NTLMFLAG_NEGOTIATE_TARGET_INFO   (1<<23) */
 
 same here, what about just dropping them?

Yes, same here.

Darshit, I guess it would be easiest if you change the patch !?

Tim




Re: [Bug-wget] [bug-wget] Libpsl for cookie domain checking in Wget

2014-06-01 Thread Tim Rühsen
On Friday, 30 May 2014, 22:24:19, Darshit Shah wrote:
 I've attached a patch that adds support for using libpsl for cookie
 domain checking in Wget.
 
 The old heuristic checks still remain as a fallback. When the libpsl
 library on the system is built without the builtin list, Wget simply
 fallsback to the old heuristic checks. Similarly, if wget is built
 without libpsl support, it continues to use the old cookie domain
 checking code.
 
 I've removed the check for numeric addresses since it seems unneeded.
 The host and cookie_host variables will be compared for a full check
 either ways.

Hi Darshit,

thanks for working on a libpsl integration into Wget.

And sorry for the version-in-library-name hassle (I tend to over-engineer).

As you know, the consensus on the libpsl mailing list was to drop the version 
from the library name. The latest release did it and thus you have to update 
the patch again :-(

Regards, Tim




Re: [Bug-wget] [bug-wget] Libpsl for cookie domain checking in Wget

2014-06-11 Thread Tim Rühsen
On Friday, 6 June 2014, 13:39:32, Darshit Shah wrote:
 I'm facing an issue with the patch I submitted for libpsl and would be
 glad if someone could help me.
 
 The configure.ac file does not work as expected. When libpsl is not
 installed on a system, the LDFLAGS does not contain -lpsl flag, but
 the configure summary shows LIBPSL: Yes.
 
 There is some discrepancy in the output that I'd like to fix. The
 build completes successfully because the HAVE_LIBPSL variable isn't
 set, and Wget compiles without libpsl support. This should however
 happen only when --without-libpsl was explicitly specified as a
 configure option.

This should do it:

AC_ARG_WITH(libpsl,
AS_HELP_STRING([--without-libpsl], [disable support for libpsl cookie 
checking.]),
[
  with_libpsl=no
], [
  AC_CHECK_LIB(psl, psl_builtin,
   [with_libpsl=yes; AC_DEFINE([WITH_LIBPSL], [1], [PSL support 
enabled]) LIBS=${LIBS} -lpsl],
   [with_libpsl=no; AC_MSG_WARN(*** libpsl was not found. 
Fallback to Wget builtin cookie checking.)])
])

But I can't compile gnulib when having WITH_LIBPSL=1.
./configure sets FTELLO_BROKEN_AFTER_SWITCHING_FROM_READ_TO_WRITE to 1, which 
causes undefined fp_
which causes headaches with some SOLARIS code :-(

Autotools is magic for me. I don't know how the above shell code can influence 
gnulib checks.
Maybe anybody knows ?

Tim



Re: [Bug-wget] [bug-wget] Libpsl for cookie domain checking in Wget

2014-06-11 Thread Tim Rühsen
On Wednesday, 11 June 2014, 13:50:46, Tim Rühsen wrote:
 On Friday, 6 June 2014, 13:39:32, Darshit Shah wrote:
  I'm facing an issue with the patch I submitted for libpsl and would be
  glad if someone could help me.
  
  The configure.ac file does not work as expected. When libpsl is not
  installed on a system, the LDFLAGS does not contain -lpsl flag, but
  the configure summary shows LIBPSL: Yes.
  
  There is some discrepancy in the output that I'd like to fix. The
  build completes successfully because the HAVE_LIBPSL variable isn't
  set, and Wget compiles without libpsl support. This should however
  happen only when --without-libpsl was explicitly specified as a
  configure option.
 
 This should do it:
 
 AC_ARG_WITH(libpsl,
 AS_HELP_STRING([--without-libpsl], [disable support for libpsl cookie
 checking.]), [
   with_libpsl=no
 ], [
   AC_CHECK_LIB(psl, psl_builtin,
[with_libpsl=yes; AC_DEFINE([WITH_LIBPSL], [1], [PSL
 support enabled]) LIBS="${LIBS} -lpsl"], [with_libpsl=no; AC_MSG_WARN(***
 libpsl was not found. Fallback to Wget builtin cookie checking.)]) ])
 
 But I can't compile the gnulib when having WITH_LIBPSL=1.
 ./configure sets FTELLO_BROKEN_AFTER_SWITCHING_FROM_READ_TO_WRITE to 1 which
 causes undefined fp_ which causes headaches with some SOLARIS code :-(
 
 Autotools is magic for me. I don't know how the above shell code can
 influence gnulib checks. Maybe anybody knows ?

Found the problem. I forgot to run 'ldconfig' after installing libpsl.
Then an ftello test failed to link, and thus the described behaviour ;-)

Tim



Re: [Bug-wget] [bug-wget] Libpsl for cookie domain checking in Wget

2014-06-12 Thread Tim Rühsen
On Wednesday, 11 June 2014, 18:57:13, Darshit Shah wrote:
 On Wed, Jun 11, 2014 at 5:20 PM, Tim Rühsen tim.rueh...@gmx.de wrote:
  On Friday, 6 June 2014, 13:39:32, Darshit Shah wrote:
  I'm facing an issue with the patch I submitted for libpsl and would be
  glad if someone could help me.
  
  The configure.ac file does not work as expected. When libpsl is not
  installed on a system, the LDFLAGS does not contain -lpsl flag, but
  the configure summary shows LIBPSL: Yes.
  
   There is some discrepancy in the output that I'd like to fix. The
  build completes successfully because the HAVE_LIBPSL variable isn't
  set, and Wget compiles without libpsl support. This should however
  happen only when --without-libpsl was explicitly specified as a
  configure option.
  
  This should do it:
  
  AC_ARG_WITH(libpsl,
  
  AS_HELP_STRING([--without-libpsl], [disable support for libpsl cookie
  checking.]), [
  
with_libpsl=no
  
  ], [
  
AC_CHECK_LIB(psl, psl_builtin,

 [with_libpsl=yes; AC_DEFINE([WITH_LIBPSL], [1], [PSL
 support enabled]) LIBS=${LIBS} -lpsl],
 [with_libpsl=no; AC_MSG_WARN(*** libpsl was not found.
 Fallback to Wget builtin cookie checking.)]) 
  ])
  
  But I can't compile the gnulib when having WITH_LIBPSL=1.
  ./configure sets FTELLO_BROKEN_AFTER_SWITCHING_FROM_READ_TO_WRITE to 1
  which causes undefined fp_ which causes headaches with some SOLARIS code
  :-(
 
 Hi Tim,
 
 Thanks for the patch. But it didn't help solve the problem. When
 libpsl isn't installed, by default ./configure succeeds but make fails
 because it somehow detects libpsl even though it isn't installed on
 the system.

I tested it and it works ok.
Perhaps you made a mistake somewhere !?

I attached my current configure.ac - please give it a try. Don't forget 
'autoreconf' after changing configure.ac. After that I do
./configure
make clean
make

If libpsl is still detected, make sure you really uninstalled libpsl.
Also 'grep -i PSL src/config.h config.log', HAVE_LIBPSL should be unset in 
src/config.h.

Further steps include looking into config.log (search for psl) to see what is 
going on.

Shouldn't be too hard to find.

Tim

dnl Template file for GNU Autoconf
dnl Copyright (C) 1995, 1996, 1997, 2001, 2007, 2008, 2009, 2010, 2011, 2012,
dnl 2013, 2014 Free Software Foundation, Inc.

dnl This program is free software; you can redistribute it and/or modify
dnl it under the terms of the GNU General Public License as published by
dnl the Free Software Foundation; either version 3 of the License, or
dnl (at your option) any later version.

dnl This program is distributed in the hope that it will be useful,
dnl but WITHOUT ANY WARRANTY; without even the implied warranty of
dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
dnl GNU General Public License for more details.

dnl You should have received a copy of the GNU General Public License
dnl along with this program.  If not, see http://www.gnu.org/licenses/.

dnl Additional permission under GNU GPL version 3 section 7

dnl If you modify this program, or any covered work, by linking or
dnl combining it with the OpenSSL project's OpenSSL library (or a
dnl modified version of that library), containing parts covered by the
dnl terms of the OpenSSL or SSLeay licenses, the Free Software Foundation
dnl grants you additional permission to convey the resulting work.
dnl Corresponding Source for a non-source form of such a combination
dnl shall include the source code for the parts of OpenSSL used as well
dnl as that of the covered work.

dnl
dnl Process this file with autoconf to produce a configure script.
dnl

AC_INIT([wget],
m4_esyscmd([build-aux/git-version-gen .tarball-version]),
[bug-wget@gnu.org])
AC_PREREQ(2.61)

dnl
dnl What version of Wget are we building?
dnl
AC_MSG_NOTICE([configuring for GNU Wget $PACKAGE_VERSION])

AC_CONFIG_MACRO_DIR([m4])
AC_CONFIG_AUX_DIR([build-aux])

AC_CONFIG_SRCDIR([src/wget.h])

dnl
dnl Automake setup
dnl
AM_INIT_AUTOMAKE([1.9])

dnl
dnl Get canonical host
dnl
AC_CANONICAL_HOST
AC_DEFINE_UNQUOTED([OS_TYPE], $host_os,
   [Define to be the name of the operating system.])

dnl
dnl Process features.
dnl

AC_ARG_WITH(libpsl,
AS_HELP_STRING([--without-libpsl], [disable support for libpsl cookie 
checking.]),
[
  with_libpsl=no
], [
  AC_CHECK_LIB(psl, psl_builtin,
   [with_libpsl=yes; AC_DEFINE([WITH_LIBPSL], [1], [PSL support 
enabled]) LIBS="${LIBS} -lpsl"], 
   [with_libpsl=no; AC_MSG_WARN(*** libpsl was not found. 
Fallback to Wget builtin cookie checking.)])
])

AC_ARG_WITH(ssl,
[[  --without-ssl   disable SSL autodetection
  --with-ssl={gnutls,openssl} specify the SSL backend.  GNU TLS is the 
default.]])

AC_ARG_WITH(zlib,
[[  --without-zlib  disable zlib ]])

AC_ARG_ENABLE(opie

Re: [Bug-wget] [bug-wget] Libpsl for cookie domain checking in Wget

2014-06-12 Thread Tim Rühsen
On Thursday, 12 June 2014, 07:16:13, Darshit Shah wrote:
 Yes, the configure statements given by Tim work. I found out that the issue
 on my machine was the caching of configure values. Deleting the configure
 cache fixed the issue.
 
 I also agree with Giuseppe's point about not using the autoconf variables.
 Let's fix the rest of them too. Tim's patch however seems to add -lpsl
 twice for me. Removing the line that explicitly adds it to LIBS does the
 trick for me.

Your configure.ac adds -lpsl later again. Did you use/check/diff the 
configure.ac that I sent you ? I removed the second AC_CHECK_LIB...

Tim



Re: [Bug-wget] [bug-wget] Libpsl for cookie domain checking in Wget

2014-06-12 Thread Tim Rühsen
On Thursday, 12 June 2014, 13:24:02, Giuseppe Scrivano wrote:
 Darshit Shah dar...@gmail.com writes:
  On Wed, Jun 11, 2014 at 5:20 PM, Tim Rühsen tim.rueh...@gmx.de wrote:
  On Friday, 6 June 2014, 13:39:32, Darshit Shah wrote:
  I'm facing an issue with the patch I submitted for libpsl and would be
  glad if someone could help me.
  
  The configure.ac file does not work as expected. When libpsl is not
  installed on a system, the LDFLAGS does not contain -lpsl flag, but
  the configure summary shows LIBPSL: Yes.
  
   There is some discrepancy in the output that I'd like to fix. The
  build completes successfully because the HAVE_LIBPSL variable isn't
  set, and Wget compiles without libpsl support. This should however
  happen only when --without-libpsl was explicitly specified as a
  configure option.
  
  This should do it:
  
  AC_ARG_WITH(libpsl,
  
  AS_HELP_STRING([--without-libpsl], [disable support for libpsl cookie
  checking.]), [
  
with_libpsl=no
  
  ], [
  
AC_CHECK_LIB(psl, psl_builtin,

 [with_libpsl=yes; AC_DEFINE([WITH_LIBPSL], [1], [PSL
 support enabled]) LIBS="${LIBS} -lpsl"],
 [with_libpsl=no; AC_MSG_WARN(*** libpsl was not
 found. Fallback to Wget builtin cookie checking.)])   

  ])
 
 I've not tested it but I think this version should fix the problem we
 had before.  Just one observation, can we use a different name instead
 of with_libpsl?  AFAICS, it is used only to display a message at the
 end of the configure script, so I would prefer we don't mess with
 variables set already by autoconf.
 We will then need to set it to no before the AC_CHECK_LIB is used.

I just read today that AC_SEARCH_LIBS should be used instead of AC_CHECK_LIB.

That would be something like:

AC_ARG_WITH(libpsl,
AS_HELP_STRING([--without-libpsl], [disable support for libpsl cookie 
checking.]),
[],
[
  AC_SEARCH_LIBS(psl_builtin, psl,
 [AC_DEFINE([WITH_LIBPSL], [1], [PSL support enabled])],
 [AC_MSG_WARN(*** libpsl was not found. Fallback to Wget 
builtin cookie checking.)])
])
AS_IF([test "x$ac_cv_search_psl_builtin" != "x-lpsl"], [ ENABLE_PSL=no ], [ 
ENABLE_PSL=yes ])

...
PSL:   $ENABLE_PSL
...

Tim




Re: [Bug-wget] [PATCH] Allow to redefine ciphers list for OpenSSL

2014-07-10 Thread Tim Rühsen
On Tuesday, 8 July 2014, 16:57:35, Giuseppe Scrivano wrote:
 Tomas Hozza thoz...@gnu.org writes:
  What do you think about extending --secure-protocol and having a runtime
  option instead of a compile time option ? Users could set the system wide
  default value in /etc/wgetrc and people are able to override it through
  ~/.wgetrc or --secure-protocol.
  
  Hi Tim.
  
  I'm afraid this is not suitable for us. We need to be able to define the
  policy somewhere in /etc, where the user is not able to change it (only
  the system administrator).
  
  Also the main intention to have a single place to set the policy for all
  system components, therefore wgetrc is not the right place for us.
  
  Regards,
 
 how would the policy defined in /etc be used by wget?  Is wget going to
 be recompiled if the policy is changed by root?

Also there is still Ángel's remark: your change only applies to --secure-
protocol=PFS. But you also answered my posting saying that users should not be 
able to change it... but they can, by using e.g. --secure-protocol=TLSv1 or by 
doing settings in ~/.wgetrc.

Maybe you could explain in a bit more detail what you want to do and what you 
expect Wget to do in a Redhat compilation. We really want to help you out.

Tim




Re: [Bug-wget] [Bug-Wget] Misc. patches

2014-07-19 Thread Tim Rühsen
ACK from here.

And please also amend 
return true ? (is_acceptable == 1) : false;
to
return is_acceptable == 1;
(the constant condition means the ternary always yields its first branch)

Regards, Tim

On Saturday, 19 July 2014, 22:05:26, Darshit Shah wrote:
 Does anyone ack this patch? It's a memory leak that I would like to fix.
 
 I'll work on Tim's suggestions next.
 
 On Sat, Jul 5, 2014 at 4:38 PM, Darshit Shah dar...@gmail.com wrote:
  I just pushed a slightly amended patch. However, here is what I propose:
  
  diff --git a/src/cookies.c b/src/cookies.c
  index 76301ac..3139671 100644
  --- a/src/cookies.c
  +++ b/src/cookies.c
  @@ -549,6 +549,9 @@ check_domain_match (const char *cookie_domain,
  const char *host)
  
 return true ? (is_acceptable == 1) : false;
   
   no_psl:
  +  /* Cleanup the PSL pointers first */
  +  xfree (cookie_domain_lower);
  +  xfree (host_lower);
  
   #endif
   
 /* For efficiency make some elementary checks first */
  
   The idea is that we add two new xfree calls instead of pushing the
   originals to after the no_psl label since we return from the function
   *before* the label is encountered when psl checks are successful.
  
  There will not be any double frees of these pointers either since that
  region of the code is only executed when psl fails and hence the xfree
  statements weren't called.
  
  On Sat, Jul 5, 2014 at 4:19 PM, Giuseppe Scrivano gscriv...@gnu.org 
wrote:
  Darshit Shah dar...@gmail.com writes:
   static bool
   check_domain_match (const char *cookie_domain, const char *host)
  
  @@ -509,6 +519,7 @@ check_domain_match (const char *cookie_domain,
  const char *host) 
   #ifdef HAVE_LIBPSL
   
 DEBUGP ((cdm: 1));
  
  +  char * cookie_domain_lower, * host_lower;
  
  please initialize them to NULL and format like char
  *cookie_domain_lower, *host_lower (no space between * and the variable
  name), otherwise...
  
 const psl_ctx_t *psl;
 int is_acceptable;
  
  @@ -519,7 +530,18 @@ check_domain_match (const char *cookie_domain,
  const char *host) 
 goto no_psl;
   
   }
  
  -  is_acceptable = psl_is_cookie_domain_acceptable (psl, host,
  cookie_domain); +  if (psl_str_to_utf8lower (cookie_domain, NULL,
  NULL, &cookie_domain_lower) != PSL_SUCCESS || + 
  psl_str_to_utf8lower (host, NULL, NULL, &host_lower) != 
PSL_SUCCESS) 
  ...if the first psl_str_to_utf8lower fails then host_lower keeps
  some bogus value...
  
  +{
   +DEBUGP (("libpsl unable to parse domain name. "
   + "Falling back to simple heuristics.\n"));
  +goto no_psl;
  +}
  +
  +  is_acceptable = psl_is_cookie_domain_acceptable (psl, host_lower,
  cookie_domain_lower); +  xfree (cookie_domain_lower);
  +  xfree (host_lower);
  
  ...and *boom* here.
  
  Aah! I somehow managed not to get any booms despite having a test
  that saw psl_str_to_utf8lower() fail. However, your comment is correct
  and I'll fix that. The general idea was that if the function fails, it
  will fail on both the calls
  
  I somehow misread the patch and the position of the no_psl label.  We
  should move the two xfree in the cleanup block, after no_psl, to avoid
  a potential memory leak.
  
  Regards,
  Giuseppe
  
  --
  Thanking You,
  Darshit Shah




Re: [Bug-wget] Script providing SOCKS proxy support

2014-07-19 Thread Tim Rühsen
On Friday, 18 July 2014, 23:58:58, Ángel González wrote:
 I have written a wrapping script for wget that -using tsocks- makes it
 connect through a socks proxy if the environment variable socks_proxy
 is set.
 
 I'm sharing it here as it may be of interest to a wider audience (or
 should it
 be included in git?)

Thanks for sharing (I guess I have to polish my bash knowledge ;-).

Could you explicitly name a license for your code (and maybe put it in the 
top comment) ?
That would clarify under what circumstances the code can be used.

Regards, Tim




Re: [Bug-wget] [Bug-Wget] Misc. patches

2014-07-20 Thread Tim Rühsen
On Monday, 21 July 2014, 00:58:49, Darshit Shah wrote:
 On Mon, Jul 7, 2014 at 8:14 PM, Tim Ruehsen tim.rueh...@gmx.de wrote:
  One more comment / idea.
  
  The 'cookie_domain' comes from an HTTP Set-Cookie response header and thus
  is (must be) toASCII() encoded (= punycode). Of course this has to be
  checked when normalizing the incoming cookie data. A cookie domain having
  non-ASCII characters should simply be dropped.
  
  The whole check only works when 'host' is also in toASCII() (punycode)
  form.
  
  Assuming this, psl_str_to_utf8lower() just reduces to a ASCII lowercase
  converter.
  
  If Wget converted any domain name input to punycode + lowercase, many
  conversions would fall away and case functions would not be needed (e.g.
  calling strcmp instead of strcasecmp; the need to call
  psl_str_to_utf8lower() would fall away, etc.).
  
  What do you think ?
 
 Sounds like an interesting idea to me. Although, how do you suggest we
 go about converting the domain names to lowercase?
 I'm not sure about this, so I'm confirming first. After running the input
 domain names through toASCII(), can we simply pass the string to
 tolower() to get the lowercase version?

That depends on the library you use.

libidn's toASCII() has a built-in lowercase conversion. So the input case does 
not matter, the output is always lowercase ASCII.

Using libidn2, you have to convert to lowercase first yourself (e.g. using 
libunistring). The output is of course lowercase ASCII.

Using libicu, you have to convert to lowercase first yourself (but libicu is 
able to do that). The output is of course lowercase ASCII.


What I thought of (and what I did in Mget) is to 'normalize' every domain 
name before further processing/comparing. 'Normalizing' means trimming, 
percent-decoding, charset transcoding to UTF-8, and toASCII() conversion 
(with or without prior lowercasing, depending on the IDN library used).
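
As a hedged illustration (this is not Wget code), the final step with libidn
could look like the sketch below. idna_to_ascii_8z() expects UTF-8 input, so
trimming, percent-decoding and transcoding are assumed to have happened
already, and libidn's toASCII() does the lowercasing itself:

#include <stdio.h>
#include <stdlib.h>
#include <idna.h>   /* libidn */

/* Normalize an already UTF-8-encoded host name to lowercase punycode (ACE).
   Returns a malloc'ed string (caller frees) or NULL on failure. */
static char *
normalize_domain (const char *utf8_host)
{
  char *ace = NULL;

  if (idna_to_ascii_8z (utf8_host, &ace, 0) != IDNA_SUCCESS)
    return NULL;

  return ace;
}

int
main (void)
{
  char *a = normalize_domain ("übel.de");
  char *b = normalize_domain ("Übel.de");

  if (a && b)
    printf ("%s\n%s\n", a, b);   /* both print xn--bel-goa.de */

  free (a);
  free (b);
  return 0;
}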

Having that, Wget's code just needs strcmp() to compare domains, and
$ wget übel.de Übel.de xn--bel-goa.de 
should reduce to a download of a single file (xn--bel-goa.de/index.html)
(but maybe it is Wget's policy to explicitly download every URL given on the 
command line, even if it is always the same !?)

There is domain name input from the command line (URLs and a few options like 
-D/--domains), from local files (-i/--input-file) and from remote files.

But Darshit, maybe this should have low priority. It is more a kind of 'code 
polishing'. I am looking forward to starting a Wget version based on a libwget 
in the next 6-12 months. Most of the code is already working in the Mget 
project, but everything needs polishing (e.g. API docs and more of Wget's 
functionality; -k/--convert-links was implemented last week ;-). And then the 
day comes to merge Wget and Mget... if that finds any friends ;-)

 
  Tim
  
  On Monday 07 July 2014 17:08:48 Darshit Shah wrote:
  +  if (psl_str_to_utf8lower (cookie_domain, NULL, NULL,
  +                            &cookie_domain_lower) == PSL_SUCCESS
  +      && psl_str_to_utf8lower (host, NULL, NULL, &host_lower) == PSL_SUCCESS)
  +    {
  +      is_acceptable = psl_is_cookie_domain_acceptable (psl, host_lower,
  +                                                       cookie_domain_lower);
  +    }
  +  else
  +    {
  +      DEBUGP (("libpsl unable to parse domain name. "
  +               "Falling back to simple heuristics.\n"));
  +      goto no_psl;
  +    }




Re: [Bug-wget] [Bug-Wget] Misc. patches

2014-07-22 Thread Tim Rühsen
On Monday, 21 July 2014, 15:35:10, Giuseppe Scrivano wrote:
 Darshit Shah dar...@gmail.com writes:
  From a44841cbe2abe712de84d7413c31fc14b44225a7 Mon Sep 17 00:00:00 2001
  From: Darshit Shah dar...@gmail.com
  Date: Mon, 21 Jul 2014 13:25:54 +0530
  Subject: [PATCH] Fix potential memory leak and libpsl configure
  
  ---
  
   ChangeLog |  4 
   configure.ac  | 16 +---
   src/ChangeLog |  5 +
   src/cookies.c |  5 -
   4 files changed, 22 insertions(+), 8 deletions(-)
 
 the patch seems correct to me.

Yes, looks good to me as well.

Tim




Re: [Bug-wget] [Bug-Wget][BUG] Progress bar does not support multibyte characters

2014-08-30 Thread Tim Rühsen
On Saturday, 30 August 2014, 09:23:08, Darshit Shah wrote:
 Earlier this year, I implemented a new, more concise form of the
 progress bar. However, I've just been given a bug report regarding the
 same, which I was unable to fix.
 
 The currently implemented progress bar shows only up to 15 characters
 of the URL. In case of longer URLs, we scroll the filename like a
 ticker. For selecting the 15 characters, wget copies 15 bytes from the
 string into the progress bar. This method fails on URLs containing
 multibyte characters. In this scenario, the progress bar happens to be
 very jittery since the string lengths vary a lot.
 
 I am trying to find a solution where we can select a substring that
 is n columns wide from a given string of potentially multibyte
 characters. If someone knows how to fix this and could implement it, it
 would be truly great!

Hi Darshit,

you are talking about UTF-8 strings ('multibyte' could also mean UCS-2/4 or 
something else).

UTF-8 strings can't be split at an arbitrary byte, only between so-called code 
points. While you could use a library to handle that, writing your own function 
is not complicated - UTF-8 is a very straightforward format. Of course you can 
find tested (GPL) source code if you search; maybe gnulib even contains 
functions for that purpose (at least I wouldn't be surprised).

See http://en.wikipedia.org/wiki/UTF-8 for a description.
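
To illustrate (my sketch, not existing Wget code): a byte budget can be
trimmed back to a code-point boundary by skipping UTF-8 continuation bytes,
which all match the bit pattern 10xxxxxx. Note this only solves the "don't
split a code point" part; mapping code points to display columns
(double-width CJK characters etc.) additionally needs something like
wcwidth(3):

#include <stddef.h>
#include <string.h>

/* Largest prefix length <= max_bytes that ends on a UTF-8 code point
   boundary.  Continuation bytes are 0x80..0xBF. */
static size_t
utf8_prefix_bytes (const char *s, size_t max_bytes)
{
  size_t len = strlen (s);

  if (len <= max_bytes)
    return len;

  /* If the byte at the cut is a continuation byte, cutting there would
     split a code point - back up to the code point's lead byte. */
  while (max_bytes > 0
         && ((unsigned char) s[max_bytes] & 0xC0) == 0x80)
    max_bytes--;

  return max_bytes;
}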

Tim




Re: [Bug-wget] [PATCH] Using parallel test harness

2014-09-27 Thread Tim Rühsen
On Sunday, 28 September 2014, 01:00:40, Darshit Shah wrote:
 patch version #3 is attached
 
 Hi Tim,
 
 I just wanted to point out that there is a blocking issue with this patch.
 It eliminates the ability for the user to execute single tests directly. I
 can run `make check` without any issues, but I cannot execute a single test
 by directly invoking it. The $src_topdir variable isn't being set correctly
 in that case.
 
 In my opinion, that is an extremely important feature when debugging Wget
 with a single failing test, or when writing a new test.

Hi Darshit,

that's a good point.

A 'make check -j4' just takes a few seconds on my machine, so I wouldn't mind. 
But yes, there might be much slower machines...

Some tests in the old test suite require $top_srcdir as well 
(e.g. Test-proxied-https-auth.px). How do you execute these directly ?

I don't have the sources here; I'll take a closer look at it on Monday.

Tim




Re: [Bug-wget] [Bug-Wget] Issues with Perl-based test suite

2014-09-27 Thread Tim Rühsen
Hi Darshit,

I am answering inline...

On Sunday, 28 September 2014, 01:23:08, Darshit Shah wrote:
 There are a few issues that I've been facing with the old Perl-based test
 suite that I'd like to highlight and discuss here.
 
 1. The way the test suite has been written, I was unable to hack together a
 patch that will allow the various tests to be run under valgrind. I'd
 like functionality similar to the new Python-based test suite, where the
 environment variable VALGRIND_TESTS causes the Wget executable to be
 invoked under valgrind for memory leak checks.
 If anyone could help by contributing a patch / sharing their wisdom on
 how this can be achieved, I'd be very grateful.

This is pretty easy once we have the parallel test suite up and running.
I already did this in configure.ac for the Mget project, so there is no 
secrecy about it. I'll make a patch when the time comes...


 2. Race conditions: The test suite seems to have some races somewhere in it.
 Over the last year or so, I've often seen Test-proxied-https.px fail and
 then pass on the second invocation. This seemed like some race, but
 occurred infrequently enough not to be a pain point. However, Tim's recent
 patch for using the parallel test harness seems to be causing more tests to
 fail for me. Now I have all the Test-iri* tests also failing very randomly
 and erratically. A second/third/nth invocation of make check will generally
 see them pass successfully. Without Tim's patch, these tests always passed
 without issues. I'm loath to believe that the patch itself is the cause of
 failure. My understanding is that it is only triggering the issue more
 often, leading to a very high rate of false positives.

Believe me, I made many test runs with the parallel test suite, with and 
without -jN. The Test-iri* tests *always* succeeded with LC_ALL=C. They 
*never* succeeded when using TESTS_ENVIRONMENT to set a Turkish locale.
I started investigating on Friday and will continue next week.

But I never saw spurious failures which might indicate races (but I'll keep an 
eye on that). How much work is it for you to install a Debian unstable into a 
virtual machine ? Just for comparison with my machine. If these races persist 
in the virtual machine, something would likely be wrong with your hardware 
(RAM ?). If not, something is wrong with your environment (I remember you have 
a cutting-edge Arch Linux box).
Shouldn't be too hard to find out...

 Again, I urge everyone reading this to share their insights on what /
 where the issue lies and how we can fix it.
 
 In general, I'd like to see the Perl-based test suite deprecated in the near
 future. The Python-based test suite suffers from none of the above-mentioned
 issues and is highly flexible and extensible. The HTTP server in
 that test suite is feature-complete and all tests can now be ported to it.
 The FTP module however does not yet exist; hence, we must keep the Perl-based
 tests around till that requirement is fulfilled.

I agree in general, but still have some problems with the Python test suite 
(e.g. I am not fluent in Python so far, the Python test suite seems slower than 
the Perl test suite, and it seems more complex to set up a new test). But I 
promise to dig into it in the near future to give you more detailed feedback 
and/or a helping hand.

Tim




Re: [Bug-wget] SSL Poodle attack

2014-10-15 Thread Tim Rühsen
On Wednesday, 15 October 2014, 13:45:18, Petr Pisar wrote:
 On Wed, Oct 15, 2014 at 11:57:47AM +0200, Tim Rühsen wrote:
  (means, the libraries defaults are used, whatever that is).
  
  Should we break compatibility and map 'auto' to TLSv1 ?
  For the security of the users.
 
 Please no. Instead of changing each TLS program, one should patch only the
 TLS library. This is the reason why we have shared libraries.
 
 So just report the issue to your vendor; he will fix the few TLS
 implementations he delivers and all applications will get fixed automatically.

Hi Petr,

I tried to make clear that Wget *explicitly* asks for SSLv2 and SSLv3 in the 
default configuration when compiled with OpenSSL. Whatever the OpenSSL library 
vendor is doing... it won't affect Wget in this case. So with that approach, 
you won't ever be safe from Poodle (I guess).

And again my question: should we change the default behaviour of future 
versions of Wget ?
In other words: since we know the library vendor wouldn't help in the above 
case, what can we do to secure Wget ?

Tim




Re: [Bug-wget] please remove SSLv3 from being used until explicitly specified

2014-10-16 Thread Tim Rühsen
On Thursday, 16 October 2014, 14:03:43, Christoph Anton Mitterer wrote:
 Hi.
 
 Could you please consider removing SSLv3 (and, if not done yet, SSLv2 as
 well) from being automatically used, while still leaving users the
 choice to manually enable it (e.g. via --secure-protocol=SSLv2/3).
 
 I think it would be a bad idea to expect that these insecure versions
 are dropped from the SSL backend libs, since they may be retained for
 debugging purposes or people may just use outdated cipher preference
 lists.
 
 
 Also, wget seems to have this --secure-protocol=PFS, which seems a
 bit strange to me, since PFS is not a property of TLS/SSL itself but
 rather of the algorithms used.
 Especially, when specifying --secure-protocol=PFS one shouldn't end up
 with SSLv2/3 accidentally :)

Thanks for your input.

We are just discussing that issue (and of course anybody is invited to take 
part here on the list).

While we (developers) could change the code in a few minutes, there might be 
side effects that we (or others) don't want. At least we need an agreement with 
the maintainers on what the optimal strategy looks like.

If you are *really* in a hurry, patch the source yourself.
But I guess the distribution maintainers will provide patches in the next few 
days.

How we change the default behaviour of Wget, and maybe what additional features 
we want to give to the users, still needs a bit of polishing.

Regards, Tim



Re: [Bug-wget] SSL Poodle attack

2014-10-16 Thread Tim Rühsen
On Wednesday, 15 October 2014, 17:26:49, Daniel Kahn Gillmor wrote:
 On 10/15/2014 03:10 PM, Tim Rühsen wrote:
  I tried to make clear that Wget *explicitly* asks for SSLv2 and SSLv3 in
  the default configuration when compiled with OpenSSL. Whatever the
  OpenSSL library vendor is doing... it won't affect Wget in this case. So
  with that approach, you won't ever be safe from Poodle (I guess).
 
  And again my question: should we change the default behaviour of future
  versions of Wget ?
  In other words: since we know the library vendor wouldn't help in the
  above case, what can we do to secure Wget ?

 hm, I think Tim is on to something here: by default, wget should use the
 default ciphersuites and protocol versions selected by the TLS library.
 Tweaking the default choices in wget itself tends to make wget more
 brittle than the underlying library.

 The only way that should work to try to improve security in wget via TLS
 implementation preference strings is if the preference string is
 explicitly a minor modification of some system default.  This may or may
 not be possible depending on the preference string syntax of the
 selected TLS implementation.

 (e.g. [for OpenSSL] if the system default is always explicitly
 referenced as DEFAULT and we decide that we never want wget to use RC4,
 then DEFAULT:-RC4 is a sensible approach, because it allows OpenSSL to
 update DEFAULT and wget gains those improvements automatically)
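
To make that concrete, here is a minimal sketch of the idea (my illustration, 
not a proposed patch; 'ctx' is assumed to be an already-created SSL_CTX *):

#include <openssl/ssl.h>

/* Start from the library's own DEFAULT cipher list and only subtract what
   we never want; future changes to DEFAULT are then picked up automatically.
   Returns 1 on success, 0 on failure, like SSL_CTX_set_cipher_list itself. */
static int
harden_ctx (SSL_CTX *ctx)
{
  return SSL_CTX_set_cipher_list (ctx, "DEFAULT:-RC4");
}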

Here is a suggestion for a GnuTLS patch.

I'll have a look at the OpenSSL ciphers and make a similar patch soon.

I also suggested (~1-2 years ago) an option to directly set priority strings /
ciphers for GnuTLS and OpenSSL. In situations like these, such an option would
allow for a quick reaction by distribution maintainers and users.

What do you think ?

Tim
From 582a887e61cea2dd0f64d462d919f8688fb7862f Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Thu, 16 Oct 2014 20:44:56 +0200
Subject: [PATCH] GnuTLS: do not use SSLv3 unless explicitly requested

---
 src/ChangeLog | 4 
 src/gnutls.c  | 5 +++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 1c4e2d5..00d3c10 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,7 @@
+2014-10-16  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* gnutls.c (ssl_connect_wget): do not use SSLv3 unless explicitly requested
+
 2014-05-03  Tim Ruehsen  tim.rueh...@gmx.de

 	* retr.c (retrieve_url): fixed memory leak
diff --git a/src/gnutls.c b/src/gnutls.c
index c09b7a2..75627e1 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -436,6 +436,7 @@ ssl_connect_wget (int fd, const char *hostname)
   switch (opt.secure_protocol)
     {
     case secure_protocol_auto:
+      err = gnutls_priority_set_direct (session, "NORMAL:%COMPAT:-VERS-SSL3.0", NULL);
       break;
     case secure_protocol_sslv2:
     case secure_protocol_sslv3:
@@ -445,10 +446,10 @@ ssl_connect_wget (int fd, const char *hostname)
       err = gnutls_priority_set_direct (session, "NORMAL:-VERS-SSL3.0", NULL);
       break;
     case secure_protocol_pfs:
-      err = gnutls_priority_set_direct (session, "PFS", NULL);
+      err = gnutls_priority_set_direct (session, "PFS:-VERS-SSL3.0", NULL);
       if (err != GNUTLS_E_SUCCESS)
         /* fallback if PFS is not available */
-        err = gnutls_priority_set_direct (session, "NORMAL:-RSA", NULL);
+        err = gnutls_priority_set_direct (session, "NORMAL:-RSA:-VERS-SSL3.0", NULL);
       break;
     default:
       abort ();
--
2.1.1





[Bug-wget] [PATCH] V2 removed 'auto' SSLv3 also from OpenSSL code

2014-10-16 Thread Tim Rühsen
patch V2
- removed SSLv3 from --secure-protocol=auto|pfs (GnuTLS code)
- removed SSLv3 from --secure-protocol=auto (OpenSSL code)
- amended the docs

I am not an OpenSSL expert... please feel free to suggest improvements.

Tim

On Thursday, 16 October 2014, 20:50:32, Tim Rühsen wrote:
 On Wednesday, 15 October 2014, 17:26:49, Daniel Kahn Gillmor wrote:
  On 10/15/2014 03:10 PM, Tim Rühsen wrote:
   I tried to make clear that Wget *explicitly* asks for SSLv2 and SSLv3
   in the default configuration when compiled with OpenSSL. Whatever the
   OpenSSL library vendor is doing... it won't affect Wget in this case. So
   with that approach, you won't ever be safe from Poodle (I guess).
  
   And again my question: should we change the default behaviour of future
   versions of Wget ?
   In other words: since we know the library vendor wouldn't help in the
   above case, what can we do to secure Wget ?
 
  hm, I think Tim is on to something here: by default, wget should use the
  default ciphersuites and protocol versions selected by the TLS library.
  Tweaking the default choices in wget itself tends to make wget more
  brittle than the underlying library.
 
  The only way that should work to try to improve security in wget via TLS
  implementation preference strings is if the preference string is
  explicitly a minor modification of some system default.  This may or may
  not be possible depending on the preference string syntax of the
  selected TLS implementation.
 
  (e.g. [for OpenSSL] if the system default is always explicitly
  referenced as DEFAULT and we decide that we never want wget to use RC4,
  then DEFAULT:-RC4 is a sensible approach, because it allows OpenSSL to
  update DEFAULT and wget gains those improvements automatically)

 Here is a suggestion for a GnuTLS patch.

 I'll have a look at the OpenSSL ciphers and make a similar patch soon.

 I also suggested (~1-2 years ago) an option to directly set priority strings
 / ciphers for GnuTLS and OpenSSL. In situations like these, such an option
 would allow for a quick reaction by distribution maintainers and users.

 What do you think ?

 Tim
From bca3e7ea1e430de4fcbc15daad60e8a2953e3a61 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen tim.rueh...@gmx.de
Date: Thu, 16 Oct 2014 20:44:56 +0200
Subject: [PATCH] do not use SSLv3 unless explicitly requested

---
 doc/ChangeLog | 4 
 doc/wget.texi | 4 ++--
 src/ChangeLog | 5 +
 src/gnutls.c  | 5 +++--
 src/openssl.c | 4 +---
 5 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/doc/ChangeLog b/doc/ChangeLog
index f055fa5..dd43162 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2014-10-16  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* wget.texi (Download Options): update --secure-protocol description
+
 2014-08-03  Giuseppe Scrivano  gscriv...@gnu.org

 	* wget.texi (Download Options): Fix texinfo warning.
diff --git a/doc/wget.texi b/doc/wget.texi
index a31eb5e..1e1dd36 100644
--- a/doc/wget.texi
+++ b/doc/wget.texi
@@ -1643,8 +1643,8 @@ without SSL support, none of these options are available.
 Choose the secure protocol to be used.  Legal values are @samp{auto},
 @samp{SSLv2}, @samp{SSLv3}, @samp{TLSv1} and @samp{PFS}.  If @samp{auto}
 is used, the SSL library is given the liberty of choosing the appropriate
-protocol automatically, which is achieved by sending an SSLv2 greeting
-and announcing support for SSLv3 and TLSv1.  This is the default.
+protocol automatically, which is achieved by sending a TLSv1 greeting.
+This is the default.

 Specifying @samp{SSLv2}, @samp{SSLv3}, or @samp{TLSv1} forces the use
 of the corresponding protocol.  This is useful when talking to old and
diff --git a/src/ChangeLog b/src/ChangeLog
index 1c4e2d5..db4cd04 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2014-10-16  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* gnutls.c (ssl_connect_wget): do not use SSLv3 unless explicitly requested
+	* openssl.c (ssl_init): do not use SSLv3 unless explicitly requested
+
 2014-05-03  Tim Ruehsen  tim.rueh...@gmx.de

 	* retr.c (retrieve_url): fixed memory leak
diff --git a/src/gnutls.c b/src/gnutls.c
index c09b7a2..75627e1 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -436,6 +436,7 @@ ssl_connect_wget (int fd, const char *hostname)
   switch (opt.secure_protocol)
     {
     case secure_protocol_auto:
+      err = gnutls_priority_set_direct (session, "NORMAL:%COMPAT:-VERS-SSL3.0", NULL);
       break;
     case secure_protocol_sslv2:
     case secure_protocol_sslv3:
@@ -445,10 +446,10 @@ ssl_connect_wget (int fd, const char *hostname)
       err = gnutls_priority_set_direct (session, "NORMAL:-VERS-SSL3.0", NULL);
       break;
     case secure_protocol_pfs:
-      err = gnutls_priority_set_direct (session, "PFS", NULL);
+      err = gnutls_priority_set_direct (session, "PFS:-VERS-SSL3.0", NULL);
       if (err != GNUTLS_E_SUCCESS)
         /* fallback if PFS is not available */
-        err = gnutls_priority_set_direct (session, "NORMAL:-RSA", NULL);

Re: [Bug-wget] please remove SSLv3 from being used until explicitly specified

2014-10-17 Thread Tim Rühsen
On Thursday, 16 October 2014, 22:01:35, Ángel González wrote:
 Ángel González wrote:
  First of all, note that wget doesn't react to a disconnect with a
  downgraded retry, thus it is mainly not vulnerable to poodle (you could
  only use CVE-2014-3566 against servers not supporting TLS).
 
 Note I tested both openssl and gnutls builds. Then I rebuilt 1.15¹ with
 both libraries using versions prior to the poodle announcement. None of
 them was affected.
 
 
 ¹ I am having some problems with src/Makefile generation, so I didn't
 test with master, but that should be equivalent.

Hi Ángel,

thanks for your testing.

I would like to reproduce it - can you tell me exactly what you did ?

The original paper talks about the 'client renegotiation dance'.
What about renegotiation at the protocol level ? Isn't it possible that a TLS 
connection goes down to SSLv3 invisibly to the client/server code ?
I am not deep enough into the TLS/SSL libraries to answer that question myself 
right now. The paper talks about 'proper protocol version negotiation' - that 
seems to need some clarification.

Tim




Re: [Bug-wget] please remove SSLv3 from being used until explicitly specified

2014-10-17 Thread Tim Rühsen
On Friday, 17 October 2014, 18:02:39, Christoph Anton Mitterer wrote:
 On Thu, 2014-10-16 at 21:34 +0200, Ángel González wrote:
  First of all, note that wget doesn't react to a disconnect with a
  downgraded retry, thus it is mainly not vulnerable to poodle (you could
  only use CVE-2014-3566 against servers not supporting TLS).
  
  Then, even in that case, as an attacker won't be able to dynamically
  connect in the background to another site, exploitation would be much
  harder (something like a recursive download on an attacker-controlled
  server (such as http) which redirects _some_ requests to the https
  target). For little gain, as it's very unlikely that such a wget would
  hold any secret for that server connection (I think you would need to use
  --load-cookies with a file shared with another -sensitive- batch
  process).
 
 Thanks for trying that out...
 But often when such issues are found, not long afterwards people can
 attack them even more, and what seems impossible right now may be
 possible then.
 
 Just look at the whole black magic to defend SSL against all the
 CBC/padding, MtE, Lucky13 and further attacks... they fixed it and some
 time later the attacks were improved and the same issues were back.
 
 That's why I think SSLv3 should no longer be used, even if wget isn't
 that strongly exposed to attacks.
 Also, one cannot say that people who depend on it wouldn't have had
 time to move on to TLSv1.x... that SSLv3 will/should be phased out has
 been clear for years.
 So I feel it is better to proactively disable it (even if not yet
 necessary) and affect those who haven't done their homework, instead of
 waiting too long and letting those suffer who did.

Looking at the thread 'SSL Poodle attack': so far everybody seems to agree to 
disable SSLv3 in the default settings.
I already posted a patch for OpenSSL and GnuTLS.

Because 'Poodle' itself does not affect Wget (e.g. you need a
Javascript-enabled client for that, and Wget does not have a renegotiation
mechanism), we are not in a hurry. SSLv3 *will* disappear from the Wget
defaults soon, I am sure. Please don't be too concerned.

Tim




Re: [Bug-wget] Bug Track

2014-10-19 Thread Tim Rühsen
On Saturday, 18 October 2014, 22:40:14, Tushar wrote:
 Hi,
 
 I am a student who would like to contribute to the GNU Project. I'm very
 passionate about the GNU organization and would like to dedicate some time
 every day to GNU. It was mentioned that I have to send an email to this
 address before diving into bugs. I would look into the bugs and,
 hopefully, I will be able to help. I don't have much programming
 experience and I know things will be difficult for me, but it is my
 passion for GNU and its philosophy that drives me.

Hi Tushar,

welcome to Wget. We appreciate any help you can give !

You don't have to be an (experienced) programmer to help out.

I guess you are already registered at Savannah (https://savannah.gnu.org) !?

We are going to clean up the bug list and if you could help, that would be very 
cool.

Just dig into the bug list and - as a first step - try to reproduce bugs with 
the latest development version of Wget (clone from the git repo and compile).

If you can't reproduce a bug, either add a comment and request more info 
(whatever is needed, e.g. debug output or Wget version) and/or send me an email 
(maybe PM), so I can view and close the bug.

You can also write me an email whenever you have any questions,
e.g. if you are going to write a new test and you are unsure - just ask me 
(and/or the list if the question is of general interest).

Please send patches directly to the list for review / discussion / 
integration.


 
 Thank you.
 

We thank you !

Tim



Re: [Bug-wget] Send Content-Length with POST 0 length body

2014-10-19 Thread Tim Rühsen
On Sunday, 19 October 2014, 11:27:51, Matthew Atkinson wrote:
 Hi
 
 I was looking through the list archives to find something that useful I
 could contribute to wget and get familiar with the code, I have attached
 a patch for the following.
 
 Darshit Shah darnir at gmail.com wrote on 2014-09-05 07:31:34 GMT
 
  The Content-Length Header is expected by the server in many requests,
  most prominently in a POST Request. However, currently if a POST
  request is sent without any body, Wget does not send any
  Content-Length headers. Some servers seem to dislike this behaviour
  and respond with a 411 Length Required message.
  Patch Wget to always send a Content-Length header when the Request
  Type is POST.
 
 Hopefully I can find something less trivial to add or fix in the future.
 Any suggestions, direction or feedback - please let me know.

Very good ! Thanks for your work.

There are two little things - well, just some organizational work:

1. Please also extend src/ChangeLog and include it in your patch
2. The maintainers will ask you for 'git format-patch' output
   If you never worked with that:
   - git commit your change(s)
   - attach the files generated with 'git format-patch -1'

Thank you

Tim




Re: [Bug-wget] [PATCH] V2 removed 'auto' SSLv3 also from OpenSSL code

2014-10-19 Thread Tim Rühsen
On Sunday, 19 October 2014, 16:07:35, Giuseppe Scrivano wrote:
 Tim Rühsen tim.rueh...@gmx.de writes:
  patch V2
  
  - removed SSLv3 from --secure-protocol=auto|pfs (GnuTLS code)
  - removed SSLv3 from --secure-protocol=auto (OpenSSL code)
  - amended the docs
  
  I am not an OpenSSL expert... please feel free to suggest improvements.
 
 same here, but it looks like a good idea to disable SSLv3, so feel free
 to push it.
 
 Regards,
 Giuseppe

It has just been pushed.

Tim




Re: [Bug-wget] please remove SSLv3 from being used until explicitly specified

2014-10-19 Thread Tim Rühsen
On Sunday, 19 October 2014, 21:11:01, Ángel González wrote:
 Tim Rühsen wrote:
  Hi Ángel,
  
  thanks for your testing.
  
   I would like to reproduce it - can you tell me exactly what you did ?
 
 I used a simple server that printed the TLS Client Hello and closed the
 connection.
 Browsers automatically retried with lower SSL versions.
 wget aborted with an «Unable to establish SSL connection.» message.
 
  The original paper talks about the 'client renegotiation dance'.
  What about renegotiation at the protocol level ? Isn't it possible that a
  TLS connection goes down to SSLv3 invisibly to the client/server code ?
 
 AFAIK no. That is protected by the HMAC. The problem is the version
 downgrade on a network error, which can be inserted by a MiTM (and without
 TLS_FALLBACK_SCSV the server won't be able to tell that the client
 downgraded its version thinking the server didn't support a greater one).
 
  I am not deep enough into the TLS/SSL libraries to answer that question
  myself right now. The paper talks about 'proper protocol version
  negotiation' - that seems to need some clarification.
 
 That's the server replying with a lower protocol version in the same
 connection.
 The downgrade was a hack for broken servers not properly supporting SSL.
 And we are paying for it now.

Thank you !

Tim



