Julian Bradfield wrote at 22:06 +0000 on Jan 19, 2012:
> On 2012-01-19, Uday Reddy <[email protected]> wrote:
> > More generally, I am thinking that there is no reason why we can't have VM
> > folders stored in some other character set, other than US-ASCII, e.g.,
> > UTF-8. Those folders won't be interoperable with other mail clients, but
> > do
> ....
> > careful reengineering effort. The assumption about 7-bit US-ASCII is
> > probably pervasive in a lot of VM code. So, it will need extensive
> > testing,
>
> I have no idea what you're talking about! VM makes no assumptions at
> all about the character set of its folders, except that that message
> headers are (as required) in ASCII - many of my patches over
> the last few years have been removing the accidental cases where it
> failed to enforce its agnosticism.
>
> VM folders are simply binary files. The character set of a given
> message - or subpart of a message - is determined by its MIME charset.
>
> If you wanted, you could transcode all non-utf-8 parts to utf-8, but
> the folder would still be a binary file; it would just be a binary
> file that happened also to be valid utf-8 as a whole.

I hope VM can handle binary bodies okay. I suspect there may be some edge cases (e.g., an embedded '\n\nFrom ' in binary data, which an mbox parser would take for the start of a new message; using something other than mbox format could help there).

But raw binary data in email messages has ramifications beyond the scope of VM. Imagine reading such a message in an xterm with emacs -nw, or through a telnet session: the right combination of bytes will send your terminal window cattywampus. Also, trying to send anything other than 7-bit ASCII through some mailers might cause issues (possibly less so in this day and age).
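
To make the '\n\nFrom ' edge case concrete, here is a rough Python sketch. It is not VM's parser (VM is Emacs Lisp, and I have not checked how it finds message boundaries); naive_split and make_message are made-up helpers, and the envelope line and headers are invented for illustration. It only shows how a splitter that trusts "blank line, then 'From '" miscounts messages once a raw binary body happens to contain those bytes.

```python
# Illustrative only: a deliberately naive mbox-style splitter.
# The helper names, envelope line, and headers are assumptions for this sketch.
SEPARATOR = b"\n\nFrom "


def naive_split(folder: bytes) -> list:
    """Split a folder wherever a blank line is followed by 'From '."""
    if not folder:
        return []
    parts = folder.split(SEPARATOR)
    # split() drops the separator, so put 'From ' back on all but the first part.
    return [parts[0]] + [b"From " + part for part in parts[1:]]


def make_message(body: bytes) -> bytes:
    """Build one message with an invented envelope line and minimal headers."""
    return (
        b"From sender@example.invalid Thu Jan 19 22:06:00 2012\n"
        b"Content-Type: application/octet-stream\n"
        b"Content-Transfer-Encoding: binary\n"
        b"\n" + body + b"\n"
    )


safe = make_message(b"\x00\x01\x02 plain binary")
# This body happens to contain the separator bytes in the middle of its data.
risky = make_message(b"\x00\x01\n\nFrom \xff\xfe more binary")

folder_ok = safe + b"\n" + safe
folder_bad = safe + b"\n" + risky

count_ok = len(naive_split(folder_ok))    # two messages stored, two found
count_bad = len(naive_split(folder_bad))  # two messages stored, three found

print(count_ok, count_bad)
```

Both folders hold two messages, but the second one comes back as three pieces, with the tail of the binary body treated as a separate message. A format that does not depend on an in-band separator would avoid this, as would encoding the body so those bytes cannot occur.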
