Bug#642563: wget needs many memory for recursive https downloads

2011-10-31 Thread Jörg Sommer
Jim Paris wrote on Tue 25 Oct, 14:26 (-0400):
 Noël Köthe wrote:
  On Friday, 23.09.2011, 23:23 +0200, Jörg Sommer wrote:
  
   downloading an https site with -r or -p makes wget grow to 500MB and
   more. With version 1.12 this wasn't the case.
   
   % time wget -p -nv https://www.fsf.org
  ...
   wget -p -nv https://www.fsf.org  55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
  ...
   Versions of packages wget depends on:
  ...
   ii  libgnutls26    2.12.10-2
  
  The difference between 1.12 and 1.13 is that upstream switched from
  openssl to gnutls. With wget 1.13 and 1.13.4 and libgnutls26 2.12.12 I
  get:
  # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
  ...
  0.54user 0.05system 0:01.53elapsed 38%CPU (0avgtext+0avgdata 77392maxresident)k
  0inputs+0outputs (0major+5474minor)pagefaults 0swaps
  
  With gnutls 2.12.12 and wget 1.13.4, do you still see the same high memory
  consumption for https downloads?
 
 With wget 1.13.4-1 and libgnutls26 2.12.12-1:
 
   # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
   ...
   11.29user 0.39system 0:12.48elapsed 93%CPU (0avgtext+0avgdata 2086656maxresident)k
   0inputs+0outputs (0major+131068minor)pagefaults 0swaps
 
 How many files do you have in /etc/ssl/certs?  That seems to be the
 cause here.

The problem seems to be that the certificates get loaded for every
connection:

% strace -o /tmp/wget.st -e trace=file =wget -q --spider -r -l 1 https://fsfe.org/
^C
LC_ALL=C strace -fvttT -o /tmp/wget.st -e trace=file =wget -q --spider -r -l   107,09s usr 20,87s sys 4:33,34 tot 258MB 157 85855 pf 761641 41557 cs

% grep -o 'open../etc/ssl.*)' /tmp/wget.st |sort |uniq -c |awk '{print $1}' |sort -u
37
38
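
The pipeline above only prints the distinct open counts (every file under /etc/ssl was opened 37 or 38 times). To see which files are involved, the counts can be listed per path. This is a minimal sketch, not something from the original report: the function name count_cert_opens is made up for illustration, and it assumes the strace log contains open() or openat() lines with the path in double quotes.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): list how often each
# file under /etc/ssl was opened, given a strace log from `-e trace=file`.
count_cert_opens() {
    # $1: path to the strace log, e.g. /tmp/wget.st
    grep -E 'open(at)?\(' "$1" \
        | grep -o '"/etc/ssl/[^"]*"' \
        | tr -d '"' \
        | sort | uniq -c | sort -rn
}

# Example (assumes the log from the strace run above exists):
#   count_cert_opens /tmp/wget.st | head
```

Each output line is a count followed by a path, highest count first. Counts that roughly match the number of https connections would support the idea that the whole directory is re-read per connection.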

 If I remove all of the individual certificates and keep only the
 bundle:
 
   # cd /etc/ssl/certs
   # ls | wc -l
   474

Me, too.

 Then it's fast again:
 
   # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
   ...
   0.11user 0.00system 0:00.36elapsed 32%CPU (0avgtext+0avgdata 20480maxresident)k
   0inputs+0outputs (0major+1458minor)pagefaults 0swaps

And if you download multiple files, e.g. run wget recursively? Sorry, I
can't try it myself.

I ran valgrind, but had to kill it because it used too much memory.
Still, the summary suggests there is a memory leak.

% valgrind =wget -q --spider https://fsfe.org/
==4100== HEAP SUMMARY:
==4100==     in use at exit: 46,366,821 bytes in 1,604,336 blocks
==4100==   total heap usage: 16,559,135 allocs, 14,954,799 frees, 1,365,739,117 bytes allocated
==4100== 
==4100== LEAK SUMMARY:
==4100==    definitely lost: 94,924 bytes in 2,151 blocks
==4100==    indirectly lost: 37,810,070 bytes in 1,342,897 blocks
==4100==      possibly lost: 2,955,933 bytes in 69,663 blocks
==4100==    still reachable: 5,505,894 bytes in 189,625 blocks
==4100==         suppressed: 0 bytes in 0 blocks
==4100== Rerun with --leak-check=full to see details of leaked memory
==4100== 
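
Adding up the "definitely lost" and "indirectly lost" lines gives a rough size for the leak at the point valgrind was killed. This is a minimal sketch, assuming the summary was saved to a log file; the function name lost_bytes and the log path are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): sum the bytes that a
# valgrind LEAK SUMMARY reports as definitely or indirectly lost.
lost_bytes() {
    # $1: file containing valgrind output
    sed -nE 's/^==[0-9]+== *(definitely|indirectly) lost: *([0-9,]+) bytes.*/\2/p' "$1" \
        | tr -d ',' \
        | awk '{ sum += $1 } END { printf "%d\n", sum }'
}

# Example (the log path is an assumption):
#   lost_bytes /tmp/valgrind.log
```

For the summary above this is 94,924 + 37,810,070 = 37,904,994 bytes, roughly 36 MiB, for a single --spider request.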

Bye, Jörg.
-- 
UNIX is user friendly, it's just picky about who its friends are




Bug#642563: wget needs many memory for recursive https downloads

2011-10-25 Thread Noël Köthe
Hello Jörg, Stefan and Jim,

Thanks for your bug report and comments on bugs.debian.org/642563

On Friday, 23.09.2011, 23:23 +0200, Jörg Sommer wrote:

 downloading an https site with -r or -p makes wget grow to 500MB and
 more. With version 1.12 this wasn't the case.
 
 % time wget -p -nv https://www.fsf.org
...
 wget -p -nv https://www.fsf.org  55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
...
 Versions of packages wget depends on:
...
 ii  libgnutls26    2.12.10-2

The difference between 1.12 and 1.13 is that upstream switched from
openssl to gnutls. With wget 1.13 and 1.13.4 and libgnutls26 2.12.12 I
get:
# LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
...
0.54user 0.05system 0:01.53elapsed 38%CPU (0avgtext+0avgdata 77392maxresident)k
0inputs+0outputs (0major+5474minor)pagefaults 0swaps

With gnutls 2.12.12 and wget 1.13.4, do you still see the same high memory
consumption for https downloads?

-- 
Noël Köthe noel debian.org
Debian GNU/Linux, www.debian.org




Bug#642563: wget needs many memory for recursive https downloads

2011-10-25 Thread Jim Paris
Noël Köthe wrote:
 Hello Jörg, Stefan and Jim,
 
 Thanks for your bug report and comments on bugs.debian.org/642563
 
 On Friday, 23.09.2011, 23:23 +0200, Jörg Sommer wrote:
 
  downloading an https site with -r or -p makes wget grow to 500MB and
  more. With version 1.12 this wasn't the case.
  
  % time wget -p -nv https://www.fsf.org
 ...
  wget -p -nv https://www.fsf.org  55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
 ...
  Versions of packages wget depends on:
 ...
  ii  libgnutls26    2.12.10-2
 
 The difference between 1.12 and 1.13 is that upstream switched from
 openssl to gnutls. With wget 1.13 and 1.13.4 and libgnutls26 2.12.12 I
 get:
 # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
 ...
 0.54user 0.05system 0:01.53elapsed 38%CPU (0avgtext+0avgdata 77392maxresident)k
 0inputs+0outputs (0major+5474minor)pagefaults 0swaps
 
 With gnutls 2.12.12 and wget 1.13.4, do you still see the same high memory
 consumption for https downloads?

With wget 1.13.4-1 and libgnutls26 2.12.12-1:

  # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
  ...
  11.29user 0.39system 0:12.48elapsed 93%CPU (0avgtext+0avgdata 2086656maxresident)k
  0inputs+0outputs (0major+131068minor)pagefaults 0swaps

How many files do you have in /etc/ssl/certs?  That seems to be the
cause here.  If I remove all of the individual certificates and keep
only the bundle:

  # cd /etc/ssl/certs
  # ls | wc -l
  474
  # mkdir bad/
  # mv *.? *.pem bad/
  # ls -F
  3bab3a36@  bad/  ca-certificates.crt  java/

Then it's fast again:

  # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
  ...
  0.11user 0.00system 0:00.36elapsed 32%CPU (0avgtext+0avgdata 20480maxresident)k
  0inputs+0outputs (0major+1458minor)pagefaults 0swaps
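
A less invasive way to try the same experiment, without moving files out of /etc/ssl/certs, might be to point wget at a scratch directory that holds only the bundle. This is an untested sketch: the helper name make_bundle_only_dir is made up, the default bundle path is taken from the ls output above, and it is only an assumption that wget's --ca-directory option makes the GnuTLS backend skip /etc/ssl/certs.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): create a scratch
# directory containing only a copy of the CA bundle and print its path.
make_bundle_only_dir() {
    # $1: path to the CA bundle; the default is an assumption based on
    #     the directory listing in this thread
    bundle=${1:-/etc/ssl/certs/ca-certificates.crt}
    dir=$(mktemp -d) || return 1
    cp "$bundle" "$dir/" || return 1
    printf '%s\n' "$dir"
}

# Example (assumes --ca-directory replaces the default directory):
#   certdir=$(make_bundle_only_dir)
#   LC_ALL=C /usr/bin/time wget --ca-directory="$certdir" -O /dev/null https://www.google.com/
```

If the timing and maxresident figures drop as in the run above, that would point at the number of files in the certificate directory rather than at the bundle itself.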

-jim



--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#642563: wget needs many memory for recursive https downloads

2011-10-25 Thread Jörg Sommer
Hello Noël,

Noël Köthe wrote on Tue 25 Oct, 12:40 (+0200):
 On Friday, 23.09.2011, 23:23 +0200, Jörg Sommer wrote:
 
  downloading an https site with -r or -p makes wget grow to 500MB and
  more. With version 1.12 this wasn't the case.
  
  % time wget -p -nv https://www.fsf.org
 ...
  wget -p -nv https://www.fsf.org  55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
 ...
  Versions of packages wget depends on:
 ...
  ii  libgnutls26    2.12.10-2
 
 The difference between 1.12 and 1.13 is that upstream switched from
 openssl to gnutls. With wget 1.13 and 1.13.4 and libgnutls26 2.12.12 I
 get:
 # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
 ...
 0.54user 0.05system 0:01.53elapsed 38%CPU (0avgtext+0avgdata 77392maxresident)k
 0inputs+0outputs (0major+5474minor)pagefaults 0swaps
 
 With gnutls 2.12.12 and wget 1.13.4, do you still see the same high memory
 consumption for https downloads?

Yes, I still see this massive memory consumption.

% LCC dpkg -l wget libgnutls26
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version        Description
+++-==============-==============-============================================
ii  libgnutls26    2.12.12-1      GNU TLS library - runtime library
ii  wget           1.13.4-1       retrieves files from the web

% time wget -p -nv https://www.fsf.org
…
FINISHED --2011-10-25 20:44:01--
Total wall clock time: 5m 2s
Downloaded: 15 files, 159K in 13s (11,9 KB/s)
noglob wget -p -nv https://www.fsf.org  97,71s usr 5,19s sys 5:02,80 tot 257MB 26 78488 pf 675 42879 cs

Still more than 200MB for a single page.

Bye, Jörg.
-- 
 I hardly know OpenBSD at all; what are its advantages
 over Linux and iptables?
The show-off effect is bigger. :-
Message-ID: slrnb11064.54g.hsch...@humbert.ddns.org




Bug#642563: wget needs many memory for recursive https downloads

2011-09-28 Thread Stefan Bühler

Same here.


-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (101, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 3.0.0-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages wget depends on:
ii  dpkg           1.16.0.3
ii  install-info   4.13a.dfsg.1-8
ii  libc6          2.13-21
ii  libgcrypt11    1.5.0-3
ii  libgnutls26    2.12.10-2
ii  libgpg-error0  1.10-1
ii  libidn11       1.22-3
ii  zlib1g         1:1.2.3.4.dfsg-3






Bug#642563: wget needs many memory for recursive https downloads

2011-09-23 Thread Jörg Sommer
Package: wget
Version: 1.13-1
Severity: normal

Hi,

downloading an https site with -r or -p makes wget grow to 500MB and
more. With version 1.12 this wasn't the case.

% time wget -p -nv https://www.fsf.org
2011-09-23 23:20:11 URL:https://www.fsf.org/ [27886/27886] -> www.fsf.org/index.html [1]
2011-09-23 23:20:15 URL:https://www.fsf.org/robots.txt [89/89] -> www.fsf.org/robots.txt [1]
2011-09-23 23:20:24 URL:https://static.fsf.org/nosvn/plone3/css/print.css [2216/2216] -> www.fsf.org/static/nosvn/plone3/css/print.css [1]
…
FINISHED --2011-09-23 23:22:48--
Total wall clock time: 2m 44s
Downloaded: 15 files, 166K in 9,9s (16,7 KB/s)
wget -p -nv https://www.fsf.org  55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
                                                                  ^

Bye, Jörg.

-- System Information:
Debian Release: unstable/experimental
  APT prefers unstable
  APT policy: (900, 'unstable'), (700, 'experimental')
Architecture: powerpc (ppc)

Kernel: Linux 3.1.0-rc5.ledtest-00231-ged2888e-dirty
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages wget depends on:
ii  dpkg           1.16.1
ii  install-info   4.13a.dfsg.1-8
ii  libc6          2.13-21
ii  libgcrypt11    1.5.0-3
ii  libgnutls26    2.12.10-2
ii  libgpg-error0  1.10-1
ii  libidn11       1.22-3
ii  zlib1g         1:1.2.5.dfsg-1

wget recommends no packages.

wget suggests no packages.

-- no debconf information

