On 23.05.2013 00:06, Nicolas Cellier wrote:
That sounds good. We could even try to fall back to UTF-32 if we encounter zeros (but this should be very rare...).
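
A rough sketch of that read-side heuristic, in Python rather than Smalltalk for brevity. The function name `decode_source`, the `byteorder` parameter, and the order of checks (zeros first, then UTF-8, then latin1) are assumptions made for illustration, not what Monticello actually does:

```python
def decode_source(data: bytes, byteorder: str = "little") -> str:
    """Hypothetical helper: guess the encoding of one chunk of source bytes."""
    # Zero bytes are very unusual in latin1 or UTF-8 source text, but every
    # UTF-32 code unit below U+01000000 contains at least one. Check this
    # first, because a NUL byte is also valid UTF-8 and would not raise.
    if b"\x00" in data and len(data) % 4 == 0:
        codec = "utf-32-le" if byteorder == "little" else "utf-32-be"
        return data.decode(codec)
    try:
        # Assumed default: newer files are written as UTF-8.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Older files written as raw bytes; latin1 maps every byte to a character.
        return data.decode("latin-1")
```

Note that the byte order has to be guessed or passed in, since the UTF-32 data carries no byte order mark in this sketch.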

For writing, ZipArchive is unaware of any encoding; it uses latin1.
In Squeak, I could place some squeakToUTF8 sends in MCMczWriter, and the equivalent UTF8TextConverter in Pharo's #serializeDefinitions:. Maybe this is also needed in some other serialize* methods (version, dependencies, who knows...).
That won't work if the file contains sources for both widestring- and bytestring-sourced methods. In that case the file would contain code stored BOTH as latin1 bytes and as UTF-32 (in the endianness of the platform it was saved from). That means you'd have to detect and handle jumps back and forth in encoding when reading...
IMHO, just consider those files lost beyond hope.
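
To illustrate the mixed-encoding case described above, here is a small Python sketch (not Smalltalk; the sample strings and the `try_decode` helper are made up for the example). It builds a buffer with a latin1 part followed by a UTF-32 part and shows that no single codec recovers the original text:

```python
# A byte-string method source stored as latin1, followed by a
# wide-string method source stored as little-endian UTF-32.
latin1_part = "foo é".encode("latin-1")
utf32_part = "λ".encode("utf-32-le")
mixed = latin1_part + utf32_part
original = "foo é" + "λ"


def try_decode(data: bytes, codec: str):
    """Hypothetical helper: return the decoded text, or None on failure."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


# Decode the whole buffer with each candidate codec in turn.
results = {
    codec: try_decode(mixed, codec)
    for codec in ("utf-8", "utf-32-le", "latin-1")
}

# UTF-8 and UTF-32 reject the buffer outright; latin1 "succeeds" but the
# UTF-32 part comes out as garbage, so the text no longer matches.
recovered = [codec for codec, text in results.items() if text == original]
```

Recovering such a file would mean finding the exact byte offsets where the encoding switches, which is the detection problem mentioned above.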

Cheers,
Henry
