On 2012-01-21, Uday Reddy <usr.vm.ro...@gmail.com> wrote:
> [Julian sent in a response yesterday, but it wasn't copied to mailing list.
> Do you want to re-send it to the mailing list, Julian?]

That's because you keep replying to me personally when you reply to my posts, instead of keeping it on the mailing list where it belongs, so I see it in mail and reply there before I see it on the list.

> To re-state what I wrote last night, I am thinking of VM folders made up of
> *characters*. They could be used with a new file extension suffix, e.g.,
> ".vm". On the disk, they would be in some coding system such as UTF-8.

In other words, you'd transcode everything to utf-8, and then say "we know all messages in the folder are in utf-8, so we can load the folder in utf-8 and bypass per-message decoding", as I remarked several messages ago.

> be in the default coding system of the folder. No "charset" headers. All

Why no charset headers? If you're munging a mime message, you should ensure that it remains a valid mime message.

> People being people, they will also want to save messages from ".vm" folders
> into byte folders. We either prohibit that, or re-encode the messages as
> proper MIME messages before saving into byte folders.

Why do this? If you transcode explicitly, the messages are still proper MIME messages, even if they're in a "character folder" (i.e. a folder which VM loads using utf-8 (or whatever) coding system rather than binary).

I don't see why this shouldn't work, but one shouldn't do it by default. Although Unicode by requirement has an injective mapping from every legacy standard, it's not the case that Unicode has an injective mapping from the disjoint union of the legacy standards (Emacs does, internally). Some Japanese users have very strong feelings about some of the Japanese/Chinese merges done in Unicode.
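
To illustrate the earlier point about explicit transcoding leaving a valid MIME message: here is a rough sketch in Python (not VM's Emacs Lisp, and not how VM does anything today). The helper name `transcode_to_utf8` and the sample message are made up for illustration, and the sketch only handles single-part text messages. The idea is simply that the body is re-encoded and the charset header is rewritten to match, rather than dropped.

```python
from email import message_from_bytes, policy


def transcode_to_utf8(raw: bytes) -> bytes:
    """Hypothetical helper: re-encode a single-part text message as UTF-8."""
    msg = message_from_bytes(raw, policy=policy.default)
    if msg.get_content_maintype() != "text":
        return raw  # this sketch leaves multipart and binary messages alone
    text = msg.get_content()  # body decoded using the declared charset
    # set_content() replaces the old Content-* headers, so the charset
    # parameter and transfer encoding describe the new body.
    msg.set_content(text, subtype=msg.get_content_subtype(), charset="utf-8")
    return msg.as_bytes()


# Made-up sample message, declared and stored as ISO-8859-1.
SAMPLE = (
    "From: someone@example.org\r\n"
    "Subject: test\r\n"
    "MIME-Version: 1.0\r\n"
    "Content-Type: text/plain; charset=iso-8859-1\r\n"
    "Content-Transfer-Encoding: 8bit\r\n"
    "\r\n"
    "na\u00efve caf\u00e9\r\n"
).encode("iso-8859-1")

# Re-parse the result to check that it is still a self-describing message.
converted = message_from_bytes(transcode_to_utf8(SAMPLE), policy=policy.default)
```

Because the result still carries its own charset declaration, saving it into a byte folder would need no further re-encoding step.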
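
The injectivity point can be seen in a small Python sketch. The choice of character (U+76F4) and of the `euc_jp` and `gb2312` codecs is mine, picked only as one example of a unified ideograph present in both a Japanese and a Chinese legacy set. Each legacy encoding maps into Unicode without collisions on its own, but once both are decoded, the result no longer says which standard the text came from.

```python
# One ideograph that exists in both JIS X 0208 and GB 2312 (assumed example).
SAMPLE = "\u76f4"

# The same character as it would be stored under two legacy charsets.
as_japanese = SAMPLE.encode("euc_jp")
as_chinese = SAMPLE.encode("gb2312")

# Decoding each one individually is lossless with respect to that charset...
unified_jp = as_japanese.decode("euc_jp")
unified_cn = as_chinese.decode("gb2312")

# ...but both land on the same code point, so after transcoding to UTF-8
# the two inputs are byte-for-byte identical and the origin is lost.
utf8_jp = unified_jp.encode("utf-8")
utf8_cn = unified_cn.encode("utf-8")
same_after_transcoding = utf8_jp == utf8_cn
```

Keeping the original charset header, or some other language tag, next to the transcoded text is one way to retain that information.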