Hi,
Here's a tiny fix for a problem in the HTML parsing in html-url.c.
Wget crashes on HTML files that contain an incomplete STYLE tag, e.g.:
<style </style>
When it encounters such a tag, it calls get_urls_css with an invalid buffer
(the computed length is negative, because contents_end lies before
contents_begin), which leads to this crash:
bad buffer in yy_scan_bytes()
ERROR (2)
The attached patch checks that contents_begin does not come after
contents_end before calling get_urls_css. The content of the incomplete
tag still won't be parsed, but at least it will no longer cause a crash.
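
For illustration, the condition the patch adds can be sketched in isolation
as below. This is only a minimal sketch, not the code in html-url.c:
struct tag_span, span_is_valid and span_length are hypothetical names, and
only the two pointer fields that the patch inspects are modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for the parser's tag record.  Only the two
   pointers compared by the patch are modelled; both are assumed to
   point into the same document buffer. */
struct tag_span
{
  const char *contents_begin;   /* assumed start of the tag contents */
  const char *contents_end;     /* assumed end of the tag contents */
};

/* Return 1 when both pointers are set and the range is not reversed,
   i.e. the same three conditions the patched `if' tests. */
static int
span_is_valid (const struct tag_span *t)
{
  return t->contents_begin != NULL
         && t->contents_end != NULL
         && t->contents_begin <= t->contents_end;
}

/* Return the contents length, or -1 when the span must be skipped.
   Without the validity check, a reversed span would yield a negative
   length, which is the value that upset the scanner. */
static ptrdiff_t
span_length (const struct tag_span *t)
{
  return span_is_valid (t) ? t->contents_end - t->contents_begin : -1;
}
```

A caller would hand the contents to the CSS scanner only when
span_is_valid returns 1, so an empty span (length 0) is still accepted
while a reversed one is skipped.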
Regards,
Gijs
=== modified file 'src/ChangeLog'
--- src/ChangeLog 2012-03-29 18:13:27 +0000
+++ src/ChangeLog 2012-04-01 20:35:28 +0000
@@ -1,3 +1,7 @@
+2012-04-01 Gijs van Tulder <gvtul...@gmail.com> (tiny change)
+
+ * html-url.c: Prevent crash on incomplete STYLE tag.
+
2012-03-29 From: Tim Ruehsen <tim.rueh...@gmx.de> (tiny change)
* utils.c (library): Include <sys/time.h>.
=== modified file 'src/html-url.c'
--- src/html-url.c 2011-04-24 11:03:48 +0000
+++ src/html-url.c 2012-04-01 16:08:18 +0000
@@ -676,7 +676,8 @@
check_style_attr (tag, ctx);
if (tag->end_tag_p && (0 == strcasecmp (tag->name, "style")) &&
- tag->contents_begin && tag->contents_end)
+ tag->contents_begin && tag->contents_end &&
+ tag->contents_begin <= tag->contents_end)
{
/* parse contents */
get_urls_css (ctx, tag->contents_begin - ctx->text,