Re: [Pharo-dev] MC should really write snaphsot/source.st in UTF8

Nicolas Cellier Wed, 22 May 2013 15:07:46 -0700

That sounds good. We could even try to fall back to UTF-32 if we encounter
zeros (but this should be very rare...).
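
A rough sketch of that read strategy, in Python rather than Smalltalk just to
show the control flow. The function name, the choice of MacRoman as the legacy
encoding, and big-endian BOM-less UTF-32 are all assumptions made for
illustration, not what Monticello does. The zero check runs first because a NUL
byte is valid UTF-8, so a failed UTF-8 decode alone would not reveal 32-bit
characters.

```python
# Assumption: stand-in for whatever legacy converter is kept around.
LEGACY_ENCODING = "mac_roman"


def decode_source(data: bytes) -> str:
    """Decode source bytes, preferring UTF-8 and falling back on failure."""
    if b"\x00" in data and len(data) % 4 == 0:
        # Zero bytes hint at 32-bit characters.
        # Assumption: big-endian, no byte order mark.
        try:
            return data.decode("utf-32-be")
        except UnicodeDecodeError:
            pass
    try:
        # "utf-8-sig" also accepts (and drops) a leading byte order mark,
        # so files written with the old BOM convention still load.
        return data.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Not valid UTF-8: reread the same bytes in the legacy encoding.
        return data.decode(LEGACY_ENCODING)
```

Pure 7-bit files take the UTF-8 branch and come out unchanged, so only files
with high bytes ever reach the legacy path.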


For writing, ZipArchive is unaware of any encoding; it just uses latin1.
In Squeak, I could place some squeakToUTF8 sends in MCMczWriter, and the
equivalent UTF8TextConverter in Pharo's #serializeDefinitions:. Maybe this is
also needed in some other serialize* methods (version, dependencies, who knows...).
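
The write side amounts to encoding explicitly before the bytes reach the
archive. Again a Python sketch of the idea only: write_member_utf8 is a
hypothetical helper, and the member name and sample source string are
placeholders, not the MCMczWriter code.

```python
import io
import zipfile


def write_member_utf8(archive: zipfile.ZipFile, name: str, source: str) -> None:
    # Encode explicitly so the archive stores UTF-8 bytes, instead of
    # leaving the choice of encoding to the archive layer.
    archive.writestr(name, source.encode("utf-8"))


# Write a sample member into an in-memory archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    write_member_utf8(archive, "snapshot/source.st", "comment: 'caf\u00e9'")

# Read the raw bytes back to check what was stored.
with zipfile.ZipFile(buffer) as archive:
    stored = archive.read("snapshot/source.st")
```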



2013/5/22 Norbert Hartl <[email protected]>

>
>
> Am 22.05.2013 um 23:16 schrieb Nicolas Cellier <
> [email protected]>:
>
> First thing would be to simplify #setConverterForCode and
> #selectTextConverterForCode.
> Do we still want to use a MacRomanTextConverter, seriously? I'm not even
> sure I've got that many files with that encoding on my Mac OS X...
> Do we really need to put a ByteOrderMark for UTF-8, seriously? See
> http://en.wikipedia.org/wiki/Byte_order_mark, it's valueless, and not
> recommended. It was a Squeak way to specify that a Squeak source file
> would use UTF-8 rather than MacRoman, but now this should be obsolescent.
>
>
> A BOM for UTF-8 does not make sense. It could act as a switch between
> legacy encoding and UTF-8, but that would also be a decision that will be
> regretted shortly after. Most files in Monticello are 7-bit, so there
> wouldn't be a problem changing the default encoding. For every other file
> an exception will be thrown. So reading as UTF-8 and, on exception, reading
> the same thing in the legacy encoding might be a way to go.
>
> Norbert
>
>
>
> 2013/5/22 Nicolas Cellier <[email protected]>
>
>>
>> http://stackoverflow.com/questions/16645848/squeak-monticello-character-encoding
>> Let's kill this one, it's totally insane
>>
>
>
