On Tue, Jun 17, 2014 at 1:23 PM, Ken Hornstein wrote:
>>So you are saying that "normal unix commands", such as grep, wc, tr
>>etc, do or someday the GNU versions will, know about UTF-8, at least
>>for file contents, if not for file names?
...
> There's an implicit assumption in nmh that messages in the message store
> are valid RFC 5322 messages and can always be treated as such (see
> dist and forw, for starters).

Some anecdotal experience that may be of interest: I've had to deal with
messages that have non-ASCII characters in headers. They do occur in the
wild, mostly in non-English locales, but also in English locales where
special characters (e.g. the pound or euro sign) are used.

Because of this, a program I developed that parses email needed a
configuration option specifying the default character encoding to assume
when parsing message headers. The MIME RFCs say US-ASCII is the default,
but the real world shows this is not always the case. I'm not sure what
nmh does when it encounters such data.

As for message storage, nothing prohibits nmh from auto-converting (aka
normalizing) non-ASCII encoded data to UTF-8 when storing the message. The
underlying message parsing tools of nmh should not be affected (but others
would have to confirm this). This would allow standard Unix tools, or
other tools such as search indexers, to process the files without doing
full MIME-aware parsing. It would also avoid on-the-fly decoding of
non-ASCII headers each time nmh reads a message (for pick, show, scan,
etc).

Normalizing message headers may be a problem where headers are signed
(e.g. DKIM) and there is a desire to re-verify such signatures later. I'm
unsure whether this is a real concern. If normalization were ever
supported in nmh, it should be a configurable option, so that those
concerned about such scenarios are assured the message data is left
as-is.

--ewh

_______________________________________________
Nmh-workers mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/nmh-workers
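
For illustration, the default-encoding option and the UTF-8 normalization
described above could be sketched roughly as follows in Python. This is
only a sketch under stated assumptions, not how nmh or the program
mentioned above works: the function names and the `default_charset`
parameter are hypothetical, and trying UTF-8 before the configured default
is an assumed heuristic rather than anything the RFCs specify.

```python
def decode_header_bytes(raw: bytes, default_charset: str = "iso-8859-1") -> str:
    """Decode raw (unencoded) header bytes to text.

    default_charset stands in for the hypothetical configuration option:
    the encoding to assume for 8-bit header data that is not valid UTF-8.
    """
    try:
        # ASCII is a subset of UTF-8, so well-formed headers pass through here.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: fall back to the configured default encoding.
        # errors="replace" keeps one bad byte from aborting the whole parse.
        return raw.decode(default_charset, errors="replace")


def normalize_header_bytes(raw: bytes, default_charset: str = "iso-8859-1") -> bytes:
    """Return the header re-encoded as UTF-8, e.g. before storing the message."""
    return decode_header_bytes(raw, default_charset).encode("utf-8")


if __name__ == "__main__":
    latin1_subject = b"Subject: total is \xa3 5"   # pound sign in ISO-8859-1
    print(decode_header_bytes(latin1_subject))
    print(normalize_header_bytes(latin1_subject))
```

Note that rewriting the stored bytes this way is exactly what would
invalidate a signature computed over the original header, which is why
such a step would need to be optional.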
