Hi all,

lately I have started seeing an out-of-index error when parsing Unicode text from XML files. I am *not* 100% sure this isn't due to database-side changes that now expose a wider range of text to the parser, so I cannot safely claim it's new. Still, even common accented Latin characters now get mangled into something unusable when read through the parser, and this *surely* was not happening the last time I worked on the interface, about 3 months ago, with Iliad 7.0.
Now I'm using gst 3.2 and iliad 0.8. Here is the code:

    content := 'taxonomy.xml' asFile.
    parser := XML.XMLParser new.
    parser validate: false.
    parser parse: content readStream.

It raises an error that you can easily reproduce by putting the http://eng.i-iter.org/graph/taxonomy.xml file into your local directory.

Before you go crazy (as I did) digging through the text looking for the guilty characters, I can tell you that the breakers are, for example:

1) ...the æ and œ ligatures, ...
2) Devanāgarī script for Hindi
3) Japanese Rōmaji script

I was wondering what changed... or, most probably, what kind of silly mistake I'm making...

Bèrto

-- 
==============================
Constitution of 24 June 1793 - Article 35. - When the government violates the rights of the people, insurrection is, for the people and for each portion of the people, the most sacred of rights and the most indispensable of duties.

_______________________________________________
help-smalltalk mailing list
http://lists.gnu.org/mailman/listinfo/help-smalltalk
