On Mar 29, 2010, at 11:52:43 AM, Nicolas Cellier wrote:

> 2010/3/29 Henrik Johansen <[email protected]>:
>>
>> On Mar 29, 2010, at 11:16:30 AM, Nicolas Cellier wrote:
>>
>>> I presume that under the idiom "latin1" you refer to code page 1252
>>> rather than iso8859-1, right ?
>>>
>>> Nicolas
>>
>> Good question :)
>> What IS the presumed internal encoding of ByteStrings in Squeak?
>> That's the one I meant, I merely assumed it was latin1 seeing as how the
>> text converter refers to it as such.
>> Personally I thought it was iso8859-1, seeing as the ByteString to Unicode
>> conversion does a simple shift of chars > 127 to the 0080 - 00FF range.
>>
>> Cheers,
>> Henry
>>
>
> From what I understood, CP1252 is Microsoft "latin1" and uses codes 128 to 159.
> ISO8859-1 matches the first 256 codes of Unicode latin-1 and has codes 128
> to 159 unused.
> You know, when Microsoft "uses" a standard, it's always a better standard ;)
>
> I have nothing against CP1252, it's an optimization which avoids
> wasting 32 cheap codes.
> But I'm not sure about various compatibility issues in/with the
> external world...
>
> Squeak clearly uses CP1252.
> For Pharo, there might be a mix of the two since the Sophie-like
> refactorings. Surely that is what John was referring to.
>
> Nicolas
Ummm...
All the utf8-converters in Squeak use Unicode value:, which maps charCodes 128->255 directly to Unicode values 128->255.
Unicode values 128->255 ARE iso8859-1, so if Squeak uses CP1252 as its internal format, all the converters in Squeak are wrong.

Cheers,
Henry

_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
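PS: For anyone who wants to check the 128->255 argument outside the image, here is a minimal sketch in Python rather than Smalltalk. It relies on Python's own "latin-1" and "cp1252" codecs, not on Squeak's converters, and direct_map is a hypothetical stand-in for the direct charCode-to-Unicode mapping described above, not the actual Squeak code.

```python
def direct_map(byte_values: bytes) -> str:
    """Map each byte to the Unicode code point with the same number.

    Hypothetical stand-in for a direct charCode -> Unicode value mapping.
    """
    return "".join(chr(b) for b in byte_values)


# Bytes 128..255 mapped directly give exactly what an ISO 8859-1 decoder gives.
high_bytes = bytes(range(0x80, 0x100))
assert direct_map(high_bytes) == high_bytes.decode("latin-1")

# CP1252 only disagrees in the 128..159 range, where it places printable
# characters (or leaves the slot undefined) instead of the C1 code points.
differing = []
for b in range(0x80, 0xA0):
    # errors="replace" turns the undefined CP1252 slots into U+FFFD
    as_cp1252 = bytes([b]).decode("cp1252", errors="replace")
    if as_cp1252 != chr(b):
        differing.append(b)

# From 160 upwards the two encodings agree.
upper = bytes(range(0xA0, 0x100))
same_above_159 = upper.decode("cp1252") == upper.decode("latin-1")

print(len(differing), same_above_159)
```

So a string that really held CP1252 data would only come out wrong for bytes 128 to 159 under a direct mapping; everything from 160 up converts the same either way.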
