Bug#642563: wget needs many memory for recursive https downloads
Jim Paris wrote on Tue, 25 Oct, 14:26 (-0400):
> Noël Köthe wrote:
> > On Friday, 23.09.2011, 23:23 +0200, Jörg Sommer wrote:
> > > downloading an https site with -r or -p makes wget grow to 500MB and
> > > more. For version 1.12 this wasn't the case.
> > >
> > > % time wget -p -nv https://www.fsf.org
> > > ...
> > > wget -p -nv https://www.fsf.org 55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
> > > ...
> > > Versions of packages wget depends on:
> > > ...
> > > ii libgnutls26 2.12.10-2
> >
> > The difference between 1.12 and 1.13 is that upstream switched from
> > openssl to gnutls. With wget 1.13 and 1.13.4 and libgnutls26 2.12.12 I get:
> >
> > # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
> > ...
> > 0.54user 0.05system 0:01.53elapsed 38%CPU (0avgtext+0avgdata 77392maxresident)k
> > 0inputs+0outputs (0major+5474minor)pagefaults 0swaps
> >
> > With gnutls 2.12.12 and wget 1.13.4, do you still have the same high
> > memory consumption for https downloads?
>
> With wget 1.13.4-1 and libgnutls26 2.12.12-1:
>
> # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
> ...
> 11.29user 0.39system 0:12.48elapsed 93%CPU (0avgtext+0avgdata 2086656maxresident)k
> 0inputs+0outputs (0major+131068minor)pagefaults 0swaps
>
> How many files do you have in /etc/ssl/certs? That seems to be the
> cause here.

The problem seems to be that the certificates get loaded for every connection:

% strace -o /tmp/wget.st -e trace=file =wget -q --spider -r -l 1 https://fsfe.org/
^C
LC_ALL=C strace -fvttT -o /tmp/wget.st -e trace=file =wget -q --spider -r -l 107,09s usr 20,87s sys 4:33,34 tot 258MB 157 85855 pf 761641 41557 cs
% grep -o 'open../etc/ssl.*)' /tmp/wget.st |sort |uniq -c |awk '{print $1}' |sort -u
37
38

> If I remove all of the individual certificates and keep only the bundle:
>
> # cd /etc/ssl/certs
> # ls | wc -l
> 474

Me, too.

> Then it's fast again:
>
> # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
> ...
> 0.11user 0.00system 0:00.36elapsed 32%CPU (0avgtext+0avgdata 20480maxresident)k
> 0inputs+0outputs (0major+1458minor)pagefaults 0swaps

And if you download multiple files, e.g. run wget recursively? Sorry, I can't try it myself.

I've run valgrind, but had to kill it because it used too much memory. But the summary looks like there's a memory leak.

% valgrind =wget -q --spider https://fsfe.org/
==4100== HEAP SUMMARY:
==4100==     in use at exit: 46,366,821 bytes in 1,604,336 blocks
==4100==   total heap usage: 16,559,135 allocs, 14,954,799 frees, 1,365,739,117 bytes allocated
==4100==
==4100== LEAK SUMMARY:
==4100==    definitely lost: 94,924 bytes in 2,151 blocks
==4100==    indirectly lost: 37,810,070 bytes in 1,342,897 blocks
==4100==      possibly lost: 2,955,933 bytes in 69,663 blocks
==4100==    still reachable: 5,505,894 bytes in 189,625 blocks
==4100==         suppressed: 0 bytes in 0 blocks
==4100== Rerun with --leak-check=full to see details of leaked memory
==4100==

Bye, Jörg.
-- 
UNIX is user friendly, it's just picky about who its friends are
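The counting step in the message above can be sketched as a small shell function. This is only an illustration, not the command used in the thread: the name `count_cert_opens` is hypothetical, and it assumes the log contains strace lines of the form `open("PATH", ...)` or `openat(AT_FDCWD, "PATH", ...)`. Unlike the original pipeline, it keeps the path next to each count, so it is visible which files are opened repeatedly.

```shell
#!/bin/sh
# count_cert_opens LOGFILE   (hypothetical helper name)
# Print "COUNT PATH" for every file under /etc/ssl that appears in an
# open()/openat() call in a strace log, most frequently opened first.
count_cert_opens() {
    # Keep only the quoted path from open("...") or openat(..., "...").
    sed -n 's|.*open[a-z]*(.*"\(/etc/ssl/[^"]*\)".*|\1|p' "$1" |
        sort |            # group identical paths together
        uniq -c |         # prefix each path with its number of occurrences
        sort -rn |        # highest count first
        sed 's/^ *//'     # drop the padding that uniq -c adds
}
```

Calling `count_cert_opens /tmp/wget.st` on a log like the one recorded above would list each certificate file with the number of times it was opened. A count well above one per file is consistent with the certificates being reloaded for every connection.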
Bug#642563: wget needs many memory for recursive https downloads
Hello Jörg, Stefan and Jim,

Thanks for your bug report and comments on bugs.debian.org/642563.

On Friday, 23.09.2011, 23:23 +0200, Jörg Sommer wrote:
> downloading an https site with -r or -p makes wget grow to 500MB and
> more. For version 1.12 this wasn't the case.
>
> % time wget -p -nv https://www.fsf.org
> ...
> wget -p -nv https://www.fsf.org 55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
> ...
> Versions of packages wget depends on:
> ...
> ii libgnutls26 2.12.10-2

The difference between 1.12 and 1.13 is that upstream switched from openssl to gnutls.

With wget 1.13 and 1.13.4 and libgnutls26 2.12.12 I get:

# LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
...
0.54user 0.05system 0:01.53elapsed 38%CPU (0avgtext+0avgdata 77392maxresident)k
0inputs+0outputs (0major+5474minor)pagefaults 0swaps

With gnutls 2.12.12 and wget 1.13.4, do you still have the same high memory consumption for https downloads?

-- 
Noël Köthe noel debian.org
Debian GNU/Linux, www.debian.org
Bug#642563: wget needs many memory for recursive https downloads
Noël Köthe wrote:
> Hello Jörg, Stefan and Jim,
>
> Thanks for your bug report and comments on bugs.debian.org/642563.
>
> On Friday, 23.09.2011, 23:23 +0200, Jörg Sommer wrote:
> > downloading an https site with -r or -p makes wget grow to 500MB and
> > more. For version 1.12 this wasn't the case.
> >
> > % time wget -p -nv https://www.fsf.org
> > ...
> > wget -p -nv https://www.fsf.org 55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
> > ...
> > Versions of packages wget depends on:
> > ...
> > ii libgnutls26 2.12.10-2
>
> The difference between 1.12 and 1.13 is that upstream switched from
> openssl to gnutls. With wget 1.13 and 1.13.4 and libgnutls26 2.12.12 I get:
>
> # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
> ...
> 0.54user 0.05system 0:01.53elapsed 38%CPU (0avgtext+0avgdata 77392maxresident)k
> 0inputs+0outputs (0major+5474minor)pagefaults 0swaps
>
> With gnutls 2.12.12 and wget 1.13.4, do you still have the same high
> memory consumption for https downloads?

With wget 1.13.4-1 and libgnutls26 2.12.12-1:

# LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
...
11.29user 0.39system 0:12.48elapsed 93%CPU (0avgtext+0avgdata 2086656maxresident)k
0inputs+0outputs (0major+131068minor)pagefaults 0swaps

How many files do you have in /etc/ssl/certs? That seems to be the cause here. If I remove all of the individual certificates and keep only the bundle:

# cd /etc/ssl/certs
# ls | wc -l
474
# mkdir bad/
# mv *.? *.pem bad/
# ls -F
3bab3a36@ bad/ ca-certificates.crt java/

Then it's fast again:

# LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
...
0.11user 0.00system 0:00.36elapsed 32%CPU (0avgtext+0avgdata 20480maxresident)k
0inputs+0outputs (0major+1458minor)pagefaults 0swaps

-jim

-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
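To put the two runs above side by side, the peak-memory figure can be pulled out of the `/usr/bin/time` summary lines. The following is a minimal sketch: the function name `maxresident` is hypothetical, the two input lines are copied from the message above, and the unit is whatever `time` reported on that system.

```shell
#!/bin/sh
# maxresident   (hypothetical helper name)
# Read a GNU time summary on stdin and print the number that directly
# precedes the word "maxresident".
maxresident() {
    sed -n 's/.*[^0-9]\([0-9][0-9]*\)maxresident.*/\1/p'
}

# Summary lines taken from the two runs quoted in this thread.
before='11.29user 0.39system 0:12.48elapsed 93%CPU (0avgtext+0avgdata 2086656maxresident)k'
after='0.11user 0.00system 0:00.36elapsed 32%CPU (0avgtext+0avgdata 20480maxresident)k'

b=$(printf '%s\n' "$before" | maxresident)
a=$(printf '%s\n' "$after" | maxresident)

# Integer division is enough to show the order of magnitude.
echo "with individual certificates: $b"
echo "with bundle only:             $a"
echo "ratio:                        $((b / a))x"
```

With these figures the reported peak drops by roughly a factor of one hundred once only the bundle is left in /etc/ssl/certs.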
Bug#642563: wget needs many memory for recursive https downloads
Hello Noël,

Noël Köthe wrote on Tue, 25 Oct, 12:40 (+0200):
> On Friday, 23.09.2011, 23:23 +0200, Jörg Sommer wrote:
> > downloading an https site with -r or -p makes wget grow to 500MB and
> > more. For version 1.12 this wasn't the case.
> >
> > % time wget -p -nv https://www.fsf.org
> > ...
> > wget -p -nv https://www.fsf.org 55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
> > ...
> > Versions of packages wget depends on:
> > ...
> > ii libgnutls26 2.12.10-2
>
> The difference between 1.12 and 1.13 is that upstream switched from
> openssl to gnutls. With wget 1.13 and 1.13.4 and libgnutls26 2.12.12 I get:
>
> # LC_ALL=C /usr/bin/time wget --debug -O /dev/null https://www.google.com/
> ...
> 0.54user 0.05system 0:01.53elapsed 38%CPU (0avgtext+0avgdata 77392maxresident)k
> 0inputs+0outputs (0major+5474minor)pagefaults 0swaps
>
> With gnutls 2.12.12 and wget 1.13.4, do you still have the same high
> memory consumption for https downloads?

Yes, I still have this massive memory consumption.

% LCC dpkg -l wget libgnutls26
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Description
+++-==============-============-==================================
ii  libgnutls26    2.12.12-1    GNU TLS library - runtime library
ii  wget           1.13.4-1     retrieves files from the web

% time wget -p -nv https://www.fsf.org
…
FINISHED --2011-10-25 20:44:01--
Total wall clock time: 5m 2s
Downloaded: 15 files, 159K in 13s (11,9 KB/s)
noglob wget -p -nv https://www.fsf.org 97,71s usr 5,19s sys 5:02,80 tot 257MB 26 78488 pf 675 42879 cs

Still more than 200MB for a single page.

Bye, Jörg.
-- 
> I hardly know OpenBSD; what are its advantages over Linux and
> iptables?
The foxtail effect is greater. :-
Message-ID: slrnb11064.54g.hsch...@humbert.ddns.org
Bug#642563: wget needs many memory for recursive https downloads
Same here.

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (101, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 3.0.0-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages wget depends on:
ii  dpkg           1.16.0.3
ii  install-info   4.13a.dfsg.1-8
ii  libc6          2.13-21
ii  libgcrypt11    1.5.0-3
ii  libgnutls26    2.12.10-2
ii  libgpg-error0  1.10-1
ii  libidn11       1.22-3
ii  zlib1g         1:1.2.3.4.dfsg-3
Bug#642563: wget needs many memory for recursive https downloads
Package: wget
Version: 1.13-1
Severity: normal

Hi,

downloading an https site with -r or -p makes wget grow to 500MB and more. For version 1.12 this wasn't the case.

% time wget -p -nv https://www.fsf.org
2011-09-23 23:20:11 URL:https://www.fsf.org/ [27886/27886] -> www.fsf.org/index.html [1]
2011-09-23 23:20:15 URL:https://www.fsf.org/robots.txt [89/89] -> www.fsf.org/robots.txt [1]
2011-09-23 23:20:24 URL:https://static.fsf.org/nosvn/plone3/css/print.css [2216/2216] -> www.fsf.org/static/nosvn/plone3/css/print.css [1]
…
FINISHED --2011-09-23 23:22:48--
Total wall clock time: 2m 44s
Downloaded: 15 files, 166K in 9,9s (16,7 KB/s)
wget -p -nv https://www.fsf.org 55,63s usr 3,64s sys 2:44,49 tot 254MB 0 77726 pf 345 27781 cs
                                                                 ^

Bye, Jörg.

-- System Information:
Debian Release: unstable/experimental
  APT prefers unstable
  APT policy: (900, 'unstable'), (700, 'experimental')
Architecture: powerpc (ppc)

Kernel: Linux 3.1.0-rc5.ledtest-00231-ged2888e-dirty
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages wget depends on:
ii  dpkg           1.16.1
ii  install-info   4.13a.dfsg.1-8
ii  libc6          2.13-21
ii  libgcrypt11    1.5.0-3
ii  libgnutls26    2.12.10-2
ii  libgpg-error0  1.10-1
ii  libidn11       1.22-3
ii  zlib1g         1:1.2.5.dfsg-1

wget recommends no packages.

wget suggests no packages.

-- no debconf information