Posted to StackOverflow
(https://stackoverflow.com/questions/38645553/xmlparser-in-pharo-claims-u00a0-is-invalid-utf-8):
Given the input:
<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<sms body=". what" />
Where the character after the "." in the body attribute of the sms tag is
U+00A0 (NO-BREAK SPACE),
I get the error:
XMLEncodingException: Invalid UTF-8 character encoding (line 2) (column 13)
IIUC, the UTF-8 representation of that character is 0xC2 0xA0 per Wikipedia.
Sure enough, bytes 72 and 73 of the input are 194 and 160 respectively.
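For anyone who wants to double-check the byte-level claim outside of Pharo, here is a minimal Python sketch. It rebuilds the input under the assumption of a single LF line ending and no BOM, so the offset it finds need not match the byte positions in my actual file. The names NBSP, doc and find_nbsp are made up for illustration and have nothing to do with XMLParser itself.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# UTF-8 encoding of U+00A0 on its own.
encoded = NBSP.encode("utf-8")

# Hypothetical reconstruction of the input (assumes LF line ending, no BOM).
doc = (
    "<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>\n"
    + '<sms body=".' + NBSP + 'what" />'
).encode("utf-8")


def find_nbsp(data: bytes) -> int:
    """Return the 0-based offset of the first C2 A0 pair, or -1 if absent."""
    return data.find(b"\xc2\xa0")


offset = find_nbsp(doc)

# A strict decode raises UnicodeDecodeError on malformed UTF-8,
# so reaching the next line means the bytes are well-formed.
decoded = doc.decode("utf-8", errors="strict")

# Python's bundled XML parser reads the same bytes and returns the attribute.
body = ET.fromstring(doc).get("body")

print(list(encoded), offset, repr(body))
```

If the strict decode and the parse both succeed, the bytes themselves are well-formed UTF-8, which would point at the parser (or at how the input is being fed to it) rather than at the file.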
This seems like a bug in XMLParser, or am I missing something?
-----
Cheers,
Sean