On 31/05/12 20:54, Preston Maness wrote:
> Greetings,
>
> I first ran into this with my locally installed version of wget
> (1.13.4) while attempting to archive a wordpress website. I then
> compiled the latest development version

Thank you very much for your detailed report.

> $ gdb ./wget
> ...
> (gdb) set args -d -o debug.log --html-extension --page-requisites -k
> -e robots=off --exclude-directories=wiki,forums --reject
> "*action=print" -w 1 --random-wait --warc-file=cpr-wp-debug
> http://www.cyberpunkreview.com/movie/upcoming-movies/initial-impressions-review-of-solid-state-society/
> (gdb) run

An even easier test case:

wget --convert-links "http://www.cyberpunkreview.com/movie/upcoming-movies/initial-impressions-review-of-solid-state-society/"

> However, I have no idea where to go from here. I've filed a bug as
> well with the log file and some gdb commands that I believe show a
> null pointer dereference. The pointer "u" in convert.c is set to a
> value of "0x0" at the time the program crashes:
>
> convert.c:
>
> (126) u = url_parse (cur_url->url->url, NULL, pi, true);
> (127) local_name = hash_table_get (dl_url_file_map, u->url);
>
> The bug is located here: http://savannah.gnu.org/bugs/index.php?36570

The page contains http://[http://mlmlead.iphorum.com/]/, which is an
invalid URL. url_parse can return NULL when it fails to parse a URL.
convert.c wrongly assumes that it always succeeds, so it dereferences
the NULL pointer and segfaults. See the fix below.

From 9f3182017c16769b56a17bf70878fd566c1c6f79 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C3=81ngel=20Gonz=C3=A1lez?= <[email protected]>
Date: Thu, 31 May 2012 22:57:41 +0200
Subject: [PATCH] fix segfault on wrong urls (bug 36570)

---
 ChangeLog     |    4 ++++
 src/convert.c |    5 +++++
 2 files changed, 9 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index aa249b0..2f0f965 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2012-05-31 Ángel González <[email protected]>
+
+	* convert.c: fix segfault on wrong urls (bug 36570)
+
 2012-05-13 Giuseppe Scrivano <[email protected]>
 
 	* bootstrap.conf (gnulib_modules): Add `git-version-gen'.
diff --git a/src/convert.c b/src/convert.c
index e1c58e9..3e10710 100644
--- a/src/convert.c
+++ b/src/convert.c
@@ -124,6 +124,11 @@ convert_links_in_hashtable (struct hash_table *downloaded_set,
           set_uri_encoding (pi, opt.locale, true);
 
           u = url_parse (cur_url->url->url, NULL, pi, true);
+          if (!u)
+            {
+              continue;
+            }
+
           local_name = hash_table_get (dl_url_file_map, u->url);
 
           /* Decide on the conversion type.  */
-- 
1.7.10.2
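
For anyone who wants to see the pattern outside of wget, here is a
minimal standalone sketch of the same guard. It is not wget code:
struct toy_url, toy_url_parse, toy_url_free and count_parsable_links are
hypothetical names made up for illustration, and the toy parser's idea
of "invalid" (no "http://" prefix, or a '[' after the scheme) is a
deliberate simplification, not wget's real validation rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for wget's struct url; only one field is modelled.  */
struct toy_url
{
  char *url;
};

/* Hypothetical parser, NOT wget's url_parse.  Returns NULL when the string
   does not start with "http://" or contains '[' after the scheme (a toy
   rule), and also when memory allocation fails.  */
static struct toy_url *
toy_url_parse (const char *s)
{
  struct toy_url *u;
  size_t len;

  if (strncmp (s, "http://", 7) != 0 || strchr (s + 7, '[') != NULL)
    return NULL;

  u = malloc (sizeof *u);
  if (!u)
    return NULL;

  len = strlen (s) + 1;
  u->url = malloc (len);
  if (!u->url)
    {
      free (u);
      return NULL;
    }
  memcpy (u->url, s, len);
  return u;
}

/* Release a struct returned by toy_url_parse.  */
static void
toy_url_free (struct toy_url *u)
{
  free (u->url);
  free (u);
}

/* Count the links that parse.  Links that fail to parse are skipped
   instead of being dereferenced, mirroring the "if (!u) continue;"
   guard in the patch.  */
static size_t
count_parsable_links (const char *const *links, size_t n)
{
  size_t i, ok = 0;

  assert (links != NULL || n == 0);
  for (i = 0; i < n; i++)
    {
      struct toy_url *u = toy_url_parse (links[i]);
      if (!u)
        continue;               /* skip the bad link, never touch u->url */

      if (u->url[0] != '\0')    /* safe: u is known to be non-NULL here */
        ok++;
      toy_url_free (u);
    }
  return ok;
}
```

Calling count_parsable_links on a list that includes the bad link from
the page simply leaves that entry out of the count, whereas without the
guard the access to u->url would be a NULL dereference, as in the
report.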
