Hi,

Here's a tiny fix for a problem in the HTML parsing in html-url.c.

Wget crashes on HTML files that contain an incomplete STYLE tag, e.g.:

  <style </style>

When Wget finds such a tag, it calls get_urls_css with an invalid buffer (contents_begin lies past contents_end, so the computed length is negative), which leads to this crash:

  bad buffer in yy_scan_bytes()
  ERROR (2)

The attached patch checks that tag->contents_begin does not lie past tag->contents_end before calling get_urls_css. The content of the incomplete tag still won't be parsed, but at least it will no longer lead to a crash.
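
To show the check in isolation, here is a minimal sketch. The struct tag_span type and both helper names are hypothetical stand-ins, not Wget's own code; only the three-part condition mirrors the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two tag fields the patch looks at. */
struct tag_span
{
  const char *contents_begin;
  const char *contents_end;
};

/* Length of the contents range; negative when begin lies past end.
   Both pointers must be non-NULL and point into the same buffer. */
static ptrdiff_t
contents_length (const struct tag_span *tag)
{
  return tag->contents_end - tag->contents_begin;
}

/* Same condition as in the patch: both pointers are set and the
   range is not inverted. Returns 1 if the range is safe to parse. */
static int
contents_range_ok (const struct tag_span *tag)
{
  return tag->contents_begin != NULL
         && tag->contents_end != NULL
         && tag->contents_begin <= tag->contents_end;
}
```

A caller would parse the contents only when contents_range_ok returns 1. An empty range (begin equal to end) still passes, which matches the <= in the patch.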

Regards,

Gijs

=== modified file 'src/ChangeLog'
--- src/ChangeLog	2012-03-29 18:13:27 +0000
+++ src/ChangeLog	2012-04-01 20:35:28 +0000
@@ -1,3 +1,7 @@
+2012-04-01  Gijs van Tulder  <gvtul...@gmail.com> (tiny change)
+
+	* html-url.c: Prevent crash on incomplete STYLE tag.
+
 2012-03-29  From: Tim Ruehsen <tim.rueh...@gmx.de> (tiny change)
 
 	* utils.c (library): Include <sys/time.h>.

=== modified file 'src/html-url.c'
--- src/html-url.c	2011-04-24 11:03:48 +0000
+++ src/html-url.c	2012-04-01 16:08:18 +0000
@@ -676,7 +676,8 @@
   check_style_attr (tag, ctx);
 
   if (tag->end_tag_p && (0 == strcasecmp (tag->name, "style")) &&
-      tag->contents_begin && tag->contents_end)
+      tag->contents_begin && tag->contents_end &&
+      tag->contents_begin <= tag->contents_end)
   {
     /* parse contents */
     get_urls_css (ctx, tag->contents_begin - ctx->text,
