On 06/30/2015 02:47 PM, Tim Ruehsen wrote:
Thanks Hubert, good catch.
I add the amended patch for completeness.
Tim
Hi Tim,
I amended some enhancements on top of your patch.
- Check sequences of length 5 and 6. I don't see any reason not to check them.
- Removed the call to quote() at idn_encode() since it segfaults with 0xFC
(the same test proposed by the reporters of #45236).
- General reworking to avoid code repetition.
It fixes #45236 and passes all the tests.
--
Regards,
- AJ
>From d10bdeeb31225238f47a9002e891c62c06cd6a47 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tim=20R=C3=BChsen?= <[email protected]>
Date: Tue, 30 Jun 2015 09:55:14 +0200
Subject: [PATCH] Work around a libidn <= 1.30 vulnerability
* src/iri.c: Add _utf8_is_valid() to check UTF-8 sequences before
passing them to idna_to_ascii_8z().
---
src/iri.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
diff --git a/src/iri.c b/src/iri.c
index 10ae994..faeca90 100644
--- a/src/iri.c
+++ b/src/iri.c
@@ -219,6 +219,51 @@ locale_to_utf8 (const char *str)
return str;
}
+/*
+ * Work around a libidn <= 1.30 vulnerability.
+ *
+ * The function checks for a valid UTF-8 character sequence before
+ * passing it to idna_to_ascii_8z().
+ *
+ * [1] http://lists.gnu.org/archive/html/help-libidn/2015-05/msg00002.html
+ * [2] https://lists.gnu.org/archive/html/bug-wget/2015-06/msg00002.html
+ * [3] http://curl.haxx.se/mail/lib-2015-06/0143.html
+ */
+static bool
+_utf8_is_valid(const char *utf8)
+{
+ int i, offset = 0;
+ const unsigned char *s = (const unsigned char *) utf8;
+
+ while (*s)
+ {
+ if ((*s & 0x80) == 0) /* 0xxxxxxx ASCII char */
+ offset = 1;
+ else if ((*s & 0xE0) == 0xC0) /* 110xxxxx 10xxxxxx */
+ offset = 2;
+ else if ((*s & 0xF0) == 0xE0) /* 1110xxxx 10xxxxxx 10xxxxxx */
+ offset = 3;
+ else if ((*s & 0xF8) == 0xF0) /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
+ offset = 4;
+ else if ((*s & 0xFC) == 0xF8)
+ offset = 5;
+ else if ((*s & 0xFE) == 0xFC)
+ offset = 6;
+ else
+ return false;
+
+ for (i = 1; i < offset; i++)
+ {
+ if ((s[i] & 0xC0) != 0x80)
+ return false;
+ }
+
+ s += offset;
+ }
+
+ return true;
+}
+
/* Try to "ASCII encode" UTF-8 host. Return the new domain on success or NULL
on error. */
char *
@@ -235,6 +280,13 @@ idn_encode (struct iri *i, char *host)
return NULL; /* Nothing to encode or an error occured */
}
+ if (!_utf8_is_valid(utf8_encoded ? utf8_encoded : host))
+ {
+ xfree (utf8_encoded);
+ logprintf (LOG_VERBOSE, _("Invalid UTF-8 sequence\n"));
+ return NULL;
+ }
+
/* Store in ascii_encoded the ASCII UTF-8 NULL terminated string */
ret = idna_to_ascii_8z (utf8_encoded ? utf8_encoded : host, &ascii_encoded, IDNA_FLAGS);
xfree (utf8_encoded);
--
1.9.1