On Wed, Aug 12, 2015 at 1:07 PM, Ken Hornstein wrote:
>>It appears the basic processing model is a pipeline:
>>
>>  Raw -> [Encoder] -> UTF8 -> [Processor] -> UTF8 -> [Encoder] -> Output
...
> We're going to a point where UTF-8 is going to appear in email
> addresses.  That's technically allowed today under the new RFCs.  The
> problem then becomes "Okay, 'Output' in the above stage needs to be
> 'Input' when doing message replies.  How, exactly, do we do that?"

I see the Processor as nmh application logic: it will always operate in the UTF8 realm. The Encoders are basically I/O filters that are applied at the input and output stages.

Take the reply command. The first thing it needs to do is read the original email data to generate the draft template for editing. The initial read operation is filtered thru the Encoder first. The result is passed into the nmh engine to parse header fields and other jazz to create the draft message (all of this is done in the UTF8 world). When writing the draft, the data is piped thru the Encoder, then written to disk before launching the editor (hopefully it is a no-op, but if in a non-UTF8 locale...). After editing, the draft is now the "Raw" input, repeating the pipeline again for whatever nmh is instructed to do with the draft.

I know this may elicit some groans, but I work with Java daily. Internally, all strings are Unicode (technically they are not, but the difference is irrelevant for this discussion). It is the job of the I/O readers and writers to deal with conversion between non-Unicode encodings and Unicode. I.e., before my application logic can do anything with textual information, it gets "converted" by a Reader, and the app then deals only with Unicode characters. When I write output, the Writer then converts to whatever the destination encoding is. Perl even supports a similar model for its I/O streams (if you choose to use it).

--ewh

P.S. Things may be a bit more complicated when dealing with MIME entity parsing, where each entity could be in a different encoding. In that case, each entity would have to be passed thru the Encoder for normalization.
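
To make the model above concrete, here is a minimal sketch of the decode-at-input, process-as-Unicode, encode-at-output pipeline. It is written in Python for brevity and is not nmh code: the function names (read_raw, make_reply_draft, write_output, normalize_entities) are hypothetical, the "reply" logic is a toy, and it assumes the charset of each input is already known. It also ignores header folding and RFC 2047 encoded words.

```python
def read_raw(raw: bytes, charset: str) -> str:
    """Input Encoder: turn raw bytes into Unicode text.

    Undecodable bytes are replaced rather than raising, which is one
    possible policy (an assumption, not something nmh prescribes).
    """
    return raw.decode(charset, errors="replace")


def make_reply_draft(message: str) -> str:
    """Processor: works only on Unicode text, never on bytes.

    Toy logic: pull out the Subject header and quote the body.
    """
    headers, _, body = message.partition("\n\n")
    subject = ""
    for line in headers.splitlines():
        if line.lower().startswith("subject:"):
            subject = line.split(":", 1)[1].strip()
    quoted = "\n".join("> " + line for line in body.splitlines())
    return f"Subject: Re: {subject}\n\n{quoted}\n"


def write_output(text: str, charset: str) -> bytes:
    """Output Encoder: turn Unicode text into bytes for the destination."""
    return text.encode(charset, errors="replace")


def normalize_entities(entities):
    """Per-entity case from the P.S.: each (bytes, charset) pair is
    decoded with its own charset before the Processor sees it."""
    return [read_raw(data, charset) for data, charset in entities]


# Raw (ISO-8859-1) -> [Encoder] -> Unicode -> [Processor] -> Unicode
#   -> [Encoder] -> Output (UTF-8)
raw = "Subject: caf\u00e9\n\nHello, Jos\u00e9\n".encode("iso-8859-1")
draft = make_reply_draft(read_raw(raw, "iso-8859-1"))
out = write_output(draft, "utf-8")
```

Feeding the edited draft back in is the same call chain again: the bytes in out become the "Raw" input, read with whatever charset they were written in.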
