Sean,
Your XML file is not UTF-8 encoded; it is plain Unicode (U+00A0 shows up as the
single value 160), at least the way it is served from the URL you gave.
(('http://forum.world.st/file/n4908531/illegal-UTF-sms.xml' asUrl
retrieveContents) at: 72 ) = 160 asCharacter.
"true"
Like you said,
160 asCharacter asString utf8Encoded.
"#[194 160]"
But
#[ 160 ] utf8Decoded.
Boom!
You specify UTF-8 encoding inside your XML, so I assume the parser switches
to that encoding. But your pure Unicode content is not UTF-8 encoded, so
decoding it as UTF-8 results in an exception. You see?
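To spell out the round trip, here is a minimal sketch that uses only the messages shown above plus on:do:. Catching the generic Error class is an assumption made to keep the snippet simple; the exception actually raised is a more specific one.

```smalltalk
"Encoding U+00A0 as UTF-8 yields two bytes, and decoding those two bytes round-trips."
160 asCharacter asString utf8Encoded.        "#[194 160]"
#[ 194 160 ] utf8Decoded first codePoint.    "160"

"A lone byte 160 (2r10100000) is a UTF-8 continuation byte without a lead byte,
 so the decoder has nothing valid to start from and signals an error."
[ #[ 160 ] utf8Decoded ]
	on: Error "assumption: generic class, used here only for illustration"
	do: [ :exception | exception messageText ].
```

So the file either has to contain the bytes 194 160 at that position, or it has to declare an encoding that matches what it really contains.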
Sven
> On 28 Jul 2016, at 22:05, Sean P. DeNigris wrote:
>
> monty-3 wrote
>> Just to be sure, I manually recreated your file (with the great Bless hex
>> editor) and parsed it with no issue.
>
> Thanks!
>
>
> monty-3 wrote
>> Please post your code and attach the actual source as a file separately.
>
> The code is merely:
> messageLog := FileLocator home / 'illegal-UTF-sms.xml'.
> doc := XMLDOMParser parse: messageLog.
>
> File: illegal-UTF-sms.xml
> <http://forum.world.st/file/n4908531/illegal-UTF-sms.xml>
>
>
>
> -----
> Cheers,
> Sean