Kyle J. McKay:

+       if (!*charset)
+               *charset = xstrdup("iso8859-1");

Actually the name should be "ISO-8859-1". See RFC 2616 section 3.7.1. Since charset names are case-insensitive, "iso-8859-1" would be fine too.

You'd be amazed at what you see in the wild... I'd recommend going with the recommended algorithm from WHATWG's Encoding Standard, if you want to make this robust: <http://encoding.spec.whatwg.org/#names-and-labels>.
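To make that concrete, here is a rough sketch of the label lookup as I read the Encoding Standard: strip leading and trailing ASCII whitespace, then compare ASCII case-insensitively against a fixed label table. The names encoding_from_label() and struct label_map are made up for illustration, and the table is only a small subset of the spec's; a real implementation would carry the full list.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, deliberately partial label table; the Encoding Standard
 * lists many more labels. Note that the spec maps the Latin-1 labels to
 * windows-1252. */
struct label_map {
	const char *label;	/* lowercase label */
	const char *name;	/* encoding name it resolves to */
};

static const struct label_map labels[] = {
	{ "utf-8", "UTF-8" },
	{ "utf8", "UTF-8" },
	{ "unicode-1-1-utf-8", "UTF-8" },
	{ "iso-8859-1", "windows-1252" },
	{ "iso8859-1", "windows-1252" },
	{ "iso88591", "windows-1252" },
	{ "iso_8859-1", "windows-1252" },
	{ "latin1", "windows-1252" },
	{ "l1", "windows-1252" },
	{ "us-ascii", "windows-1252" },
	{ "windows-1252", "windows-1252" },
};

/* ASCII whitespace as the spec defines it: TAB, LF, FF, CR, SPACE. */
static int is_ascii_ws(char c)
{
	return c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

/* Locale-independent lowercase for A-Z only. */
static char ascii_lower(char c)
{
	return (c >= 'A' && c <= 'Z') ? (char)(c + ('a' - 'A')) : c;
}

/* Returns the encoding name for a label, or NULL if the label is unknown. */
const char *encoding_from_label(const char *label)
{
	size_t len, i, j;

	while (is_ascii_ws(*label))
		label++;
	len = strlen(label);
	while (len && is_ascii_ws(label[len - 1]))
		len--;

	for (i = 0; i < sizeof(labels) / sizeof(labels[0]); i++) {
		const char *l = labels[i].label;

		if (strlen(l) != len)
			continue;
		for (j = 0; j < len; j++)
			if (ascii_lower(label[j]) != l[j])
				break;
		if (j == len)
			return labels[i].name;
	}
	return NULL;
}
```

With this, both "iso8859-1" from the patch and " ISO-8859-1 " resolve to the same encoding, while an unknown label returns NULL so the caller can choose its own fallback.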

The spec is partly based on research I did at my previous $DAYJOB, plus a lot of further research by the spec writer.

There is also Unicode's attempt at this, but unfortunately it produces too many false positives: <http://www.unicode.org/reports/tr22/tr22-7.html#Charset_Alias_Matching>
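For comparison, a minimal sketch of the UTS #22 loose matching as I understand it: drop everything except ASCII letters and digits, lowercase the letters, and delete each 0 that is not preceded by a digit. The function names uts22_normalize() and uts22_match() and the 64-byte buffers are my own choices for illustration, not anything from the report.

```c
#include <stddef.h>
#include <string.h>

/*
 * Writes the loosely normalized form of name into out.
 * Returns 0 on success, -1 if out (of size outsz) is too small.
 */
int uts22_normalize(const char *name, char *out, size_t outsz)
{
	size_t n = 0;
	int prev_digit = 0;	/* was the last character kept a digit? */

	for (; *name; name++) {
		char c = *name;
		int digit;

		if (c >= 'A' && c <= 'Z')
			c = (char)(c + ('a' - 'A'));
		digit = (c >= '0' && c <= '9');
		if (!digit && !(c >= 'a' && c <= 'z'))
			continue;	/* drop punctuation, spaces, etc. */
		if (c == '0' && !prev_digit)
			continue;	/* drop a zero not preceded by a digit */
		if (n + 1 >= outsz)
			return -1;
		out[n++] = c;
		prev_digit = digit;
	}
	if (!outsz)
		return -1;
	out[n] = '\0';
	return 0;
}

/* Returns 1 if the two names match under the loose rules, else 0. */
int uts22_match(const char *a, const char *b)
{
	char na[64], nb[64];

	if (uts22_normalize(a, na, sizeof(na)) ||
	    uts22_normalize(b, nb, sizeof(nb)))
		return 0;
	return !strcmp(na, nb);
}
```

Because all punctuation is ignored, "u.t.f-008" matches "UTF-8" under these rules, and so would "iso-88-591" against "ISO-8859-1". That looseness is what makes a fixed label table the stricter option.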

--
\\// Peter - http://www.softwolves.pp.se/