To give a bit of context, the problem is:
-=-=-=-=-=-=-=-=-=-=-=-=
exampleEncodedXML
    ^'<?xml version="1.0" encoding="UTF-8"?>
<test-data>&#8230;</test-data>
'
testDecodingCharacters
    | xmlDocument element |
    "XMLTokenizer testDecodingCharacters"
    xmlDocument := XMLDOMParser parseDocumentFrom: self exampleEncodedXML readStream.
    element := xmlDocument firstTagNamed: #'test-data'.
    self assert: element contentString first codePoint = 8230
-=-=-=-=-=-=-=-=-=-=-=-=
#testDecodingCharacters goes yellow (the assertion fails).
> Thinking of it, it's not really an encoding problem, rather a bug in
> the entity->character conversion. I guess there should be a similar
> test where there is an actual ellipsis character in the xml, instead
> of the entity.
Any idea how to make your test go green?
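In case it helps, the companion test you mention could look something like the following. It is only a sketch that I have not run: #exampleLiteralXML and #testDecodingLiteralCharacter are names made up here, and it relies only on the parser messages already used in #testDecodingCharacters. The string is built with Character value: 8230, so it is already decoded text rather than UTF-8 bytes, which may or may not interact with the encoding declaration.
-=-=-=-=-=-=-=-=-=-=-=-=
exampleLiteralXML
    "Hypothetical helper: same document, but with a literal ellipsis character instead of the entity"
    ^'<?xml version="1.0" encoding="UTF-8"?>
<test-data>' , (String with: (Character value: 8230)) , '</test-data>
'
testDecodingLiteralCharacter
    "Hypothetical companion to #testDecodingCharacters"
    | xmlDocument element |
    xmlDocument := XMLDOMParser parseDocumentFrom: self exampleLiteralXML readStream.
    element := xmlDocument firstTagNamed: #'test-data'.
    self assert: element contentString first codePoint = 8230
-=-=-=-=-=-=-=-=-=-=-=-=
If this one is green while the entity version stays yellow, that would point at the entity->character conversion rather than at the stream decoding.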
> And now I realize our server will not be able to connect outside its
> DMZ, so I won't be able to use the fix :D
DMZ?
Cheers,
Alexandre
>
>> On 16 May 2010, at 13:35, Damien Pollet wrote:
>>
>>> Hi,
>>>
>>> I have a failing test to show the problem, but I can't commit to the
>>> XMLSupport squeaksource, so I attach the MCZ here.
>>> Basically, if I parse a UTF-8 document with an entity like &#8230;
>>> (ellipsis), I don't get a Character with the correct #codePoint.
>>>
>>> Cheers,
>>>
>>> --
>>> Damien Pollet
>>> type less, do more [ | ] http://people.untyped.org/damien.pollet
>>> <XML-Parser-DamienPollet.75.mcz>
>>
>> --
>> _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
>> Alexandre Bergel http://www.bergel.eu
>> ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
>
> --
> Damien Pollet
> type less, do more [ | ] http://people.untyped.org/damien.pollet
_______________________________________________
Pharo-project mailing list
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project