It may be helpful to have a brief summary of the main problem that's being fixed.

Previously, the C/C++ version of libphonenumber accepted and parsed phone numbers containing malformed UTF-8 sequences by converting the offending bytes to spaces. It now rejects such input instead of returning a phone number, matching what the Java version has always done. Accepting malformed UTF-8 is a potential security issue.
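To make "reject instead of repair" concrete, here is a minimal sketch of a strict UTF-8 well-formedness check. The function name IsWellFormedUtf8 is hypothetical and this is not libphonenumber's actual code. It only illustrates the kind of check that lets a caller refuse the input outright rather than substitute spaces.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration; not part of libphonenumber's API.
// Returns true only if `s` is well-formed UTF-8: no stray or missing
// continuation bytes, no overlong forms, no surrogates, nothing above U+10FFFF.
bool IsWellFormedUtf8(const std::string& s) {
  const std::size_t n = s.size();
  std::size_t i = 0;
  while (i < n) {
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    if (b0 < 0x80) {  // ASCII, single byte
      ++i;
      continue;
    }
    std::size_t len = 0;    // total bytes in this sequence
    unsigned long cp = 0;   // decoded code point
    unsigned long min = 0;  // smallest code point allowed for this length
    if ((b0 & 0xE0) == 0xC0) {
      len = 2; cp = b0 & 0x1F; min = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {
      len = 3; cp = b0 & 0x0F; min = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {
      len = 4; cp = b0 & 0x07; min = 0x10000;
    } else {
      return false;  // stray continuation byte (e.g. 0x96) or invalid lead byte
    }
    if (n - i < len) return false;  // sequence truncated at end of input
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char b = static_cast<unsigned char>(s[i + k]);
      if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min) return false;                      // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    i += len;
  }
  return true;
}
```

With a check like this, a caller can return a parse error as soon as it returns false, instead of silently rewriting the bytes and parsing whatever is left.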

libphonenumber was also accepting well-formed input containing invalid code points such as U+0096 (a C1 control character). Such code points can result from a bad conversion from the legacy Windows-1252 encoding, where EN DASH (U+2013) is represented by the byte \x96. If the legacy text is treated as ISO-8859-1 instead of Windows-1252, \x96 is converted to U+0096 instead of U+2013. This type of input is now rejected as well.
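The mis-decoding described above can be sketched as follows. All function names here are hypothetical and none of this is libphonenumber code. The table covers the 0x80..0x9F range, the only place where Windows-1252 differs from ISO-8859-1. The five bytes Windows-1252 leaves unassigned are mapped to the same-valued C1 control, which is an assumption borrowed from common decoder behaviour.

```cpp
// ISO-8859-1 maps every byte to the code point with the same value,
// so byte 0x96 becomes the C1 control character U+0096.
char32_t DecodeLatin1(unsigned char b) { return b; }

// Windows-1252 assigns printable characters to most of 0x80..0x9F.
// Unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are passed through
// unchanged here; that choice is an assumption of this sketch.
char32_t DecodeWindows1252(unsigned char b) {
  static const char32_t kHigh[32] = {
      0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
      0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
      0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
      0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178};
  if (b >= 0x80 && b <= 0x9F) return kHigh[b - 0x80];
  return b;  // identical to ISO-8859-1 everywhere else
}

// C1 control characters occupy U+0080..U+009F.
bool IsC1Control(char32_t cp) { return cp >= 0x80 && cp <= 0x9F; }
```

Decoding the byte 0x96 with DecodeWindows1252 gives U+2013 (EN DASH), while DecodeLatin1 gives U+0096, which IsC1Control flags. That flagged case is the input that is now rejected.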
