Package: wget
Version: 1.18-5+deb9u2
Severity: important
Tags: security
Tags: patch
Fixed: 1.20.1-1
Dear maintainer,

the 09-stretch version of wget segfaults with --convert-links when it
encounters an embedded image (a data: URI in an img srcset attribute) and
tries to parse it as a link.

How to repeat

Save the following file as "index.html" in the webroot of a web server under
your control. In the given example it's "localhost".

====================================================================
<html>
<head>
<title>title</title>
</head>
<body>
<img srcset="data:image/gif;base64,AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"/>
</body>
</html>
====================================================================

Run "wget --convert-links http://localhost/index.html --debug"

Observed:

| (...)
| Length: 161 [text/html]
| Saving to: ‘index.html’
|
| index.html 100%[=========================================>] 161 --.-KB/s in 0s
|
| 2019-03-12 00:00:00 (12,1 MB/s) - ‘index.html’ saved [161/161]
|
| Scanning index.html (from http://localhost/index.html)
| Loaded index.html (size 161).
| URI encoding = ‘UTF-8’
| index.html: merge(‘http://localhost/index.html’, ‘data:image/gif;base64,AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA’) -> data:image/gif;base64,AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
| index.html: merged link "data:image/gif;base64,AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" doesn't parse.
| Segmentation fault

Expected:

| (...)
| Length: 161 [text/html]
| Saving to: ‘index.html’
|
| index.html 100%[=========================================>] 161 --.-KB/s in 0s
|
| 2019-03-12 00:00:00 (16,4 MB/s) - ‘index.html’ saved [161/161]
|
| Scanning index.html (from http://localhost/index.html)
| Loaded index.html (size 161).
| URI encoding = ‘UTF-8’
| index.html: merge(‘http://localhost/index.html’, ‘data:image/gif;base64,AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA’) -> data:image/gif;base64,AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
| index.html: merged link "data:image/gif;base64,AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" doesn't parse.
| no-follow in index.html: 0
| Converting links in index.html... nothing to do.
| Converted links in 1 files in 0,001 seconds.

This was fixed in 10-buster/sid (1.20), and the change is fairly simple; see
the attached patch. Please apply when convenient.

The 08-jessie version is not affected.

Cheers,
Christoph

-- System Information:
Debian Release: 9.8
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'proposed-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.26 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8), LANGUAGE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: unable to detect

Versions of packages wget depends on:
ii  libc6        2.24-11+deb9u4
ii  libgnutls30  3.5.8-5+deb9u4
ii  libidn11     1.33-1
ii  libnettle6   3.3-1+b2
ii  libpcre3     2:8.39-3
ii  libpsl5      0.17.0-3
ii  libuuid1     2.29.2-1+deb9u1
ii  zlib1g       1:1.2.8.dfsg-5

Versions of packages wget recommends:
ii  ca-certificates  20161130+nmu1+deb9u1

wget suggests no packages.

-- no debconf information
--- a/src/html-url.c
+++ b/src/html-url.c
@@ -729,8 +729,11 @@
                                            srcset + url_end);
               struct urlpos *up = append_url (url_text, base_ind + url_start,
                                               url_end - url_start, ctx);
-              up->link_inline_p = 1;
-              up->link_noquote_html_p = 1;
+              if (up)
+                {
+                  up->link_inline_p = 1;
+                  up->link_noquote_html_p = 1;
+                }
               xfree (url_text);
             }