To give a bit of context, the problem is:

-=-=-=-=-=-=-=-=-=-=-=-=
exampleEncodedXML
        ^'<?xml version="1.0" encoding="UTF-8"?>
<test-data>&#8230;</test-data>
'

testDecodingCharacters
        | xmlDocument element |
        "XMLTokenizer testDecodingCharacters"

        xmlDocument := XMLDOMParser parseDocumentFrom: self exampleEncodedXML readStream.
        element := xmlDocument firstTagNamed: #'test-data'.
        
        self assert: element contentString first codePoint = 8230
-=-=-=-=-=-=-=-=-=-=-=-=

#testDecodingCharacters goes yellow, i.e. the assertion fails.

> Thinking of it, it's not really an encoding problem, rather a bug in
> the entity->character conversion. I guess there should be a similar
> test where there is an actual ellipsis character in the xml, instead
> of the entity.

Any idea how to make your test go green?
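
For reference, the companion test suggested above might look like the sketch below. The selectors #exampleLiteralXML and #testDecodingLiteralCharacter are made up for illustration, and I am assuming that concatenating a wide Character into the string is an acceptable way to get a literal ellipsis into the document. Whether the parser handles such an already-decoded stream correctly is exactly what the test would show, so I make no claim about its outcome.

-=-=-=-=-=-=-=-=-=-=-=-=
exampleLiteralXML
        "Hypothetical fixture: same document as #exampleEncodedXML, but with a
        literal U+2026 (ellipsis) in place of the &#8230; reference."
        ^'<?xml version="1.0" encoding="UTF-8"?>
<test-data>' , (String with: (Character value: 8230)) , '</test-data>
'

testDecodingLiteralCharacter
        | xmlDocument element |
        "Hypothetical companion to #testDecodingCharacters."

        xmlDocument := XMLDOMParser parseDocumentFrom: self exampleLiteralXML readStream.
        element := xmlDocument firstTagNamed: #'test-data'.

        "Same expectation as the entity version: the first character is U+2026."
        self assert: element contentString first codePoint = 8230
-=-=-=-=-=-=-=-=-=-=-=-=

If this one goes green while #testDecodingCharacters stays yellow, that would point at the entity->character conversion rather than at the stream decoding.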

> And now I realize our server will not be able to connect outside its
> DMZ, so I won't be able to use the fix :D

DMZ?

Cheers,
Alexandre

>> On 16 May 2010, at 13:35, Damien Pollet wrote:
>> 
>>> Hi,
>>> 
>>> I have a failing test to show the problem, but I can't commit to the
>>> XMLSupport squeaksource, so I attach the MCZ here.
>>> Basically, if I parse a UTF-8 document with an entity like &#8230;
>>> (ellipsis), I don't get a Character with the correct #codePoint.
>>> 
>>> Cheers,
>>> 
>>> --
>>> Damien Pollet
>>> type less, do more [ | ] http://people.untyped.org/damien.pollet
>>> <XML-Parser-DamienPollet.75.mcz>
>> 
>> --
>> _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
>> Alexandre Bergel  http://www.bergel.eu
>> ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
>
> -- 
> Damien Pollet
> type less, do more [ | ] http://people.untyped.org/damien.pollet


_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
