Re: [Pharo-dev] MC should really write snapshot/source.st in UTF8

Nicolas Cellier Thu, 23 May 2013 03:53:39 -0700

The snapshot/source.st does not contain a mix of ByteString and WideString,
because a single String is written during the process: all code is written
into a String new writeStream, which turns the String into a WideString at
the first wide Character. The file is therefore either entirely byte-encoded
or entirely wide, so it should work.
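
To make the read side concrete, here is a minimal sketch (in Python, not
Smalltalk) of the fallback discussed in this thread. It assumes the whole file
is in one encoding, and that a wide file is a raw dump of 32-bit code points
in unknown byte order. The function name decode_source and the order of the
attempts are my own assumptions, not anything Monticello does today.

```python
def decode_source(raw: bytes) -> str:
    """Hypothetical reader for the bytes of snapshot/source.st."""
    # Zero bytes are very unlikely in UTF-8 or latin1 source code, so treat
    # them as a hint that the file holds 32-bit code points (assumption).
    if b"\x00" in raw and len(raw) % 4 == 0:
        # The byte order depends on the platform that saved the file,
        # so try both explicit orders.
        for codec in ("utf-32-be", "utf-32-le"):
            try:
                return raw.decode(codec)
            except UnicodeDecodeError:
                continue
    try:
        # Preferred encoding for newly written files.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Legacy files written without any conversion; latin1 never fails.
        return raw.decode("latin-1")
```

This only helps if the file really is in a single encoding, as argued above.
A file that switched between latin1 and UTF-32 part-way through would not be
recovered by it.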



2013/5/23 Henrik Sperre Johansen <[email protected]>

> On 23.05.2013 00:06, Nicolas Cellier wrote:
>
>> That sounds good. We could even try to fall back to UTF-32 if we encounter
>> zeros (but this should be very rare...).
>>
>> For writing, ZipArchive is unaware of any encoding; it uses latin1.
>> In Squeak, I could place some squeakToUTF8 sends in MCMczWriter, and the
>> equivalent UTF8TextConverter in Pharo's #serializeDefinitions:. Maybe this
>> is needed in some other serialize* too (version, dependencies, who knows...)
>>
> That won't work if the file contains sources for both WideString- and
> ByteString-sourced methods.
> In that case the file would contain code stored BOTH as latin1 bytes and
> as UTF-32 (in the endianness of the platform it was saved from).
> Which means you'd have to detect and handle jumps back and forth in
> encoding when reading...
> IMHO, just consider those files lost beyond hope.
>
> Cheers,
> Henry
>
>
