> > Let's take an example: multipart (MIME) email with latin-1 and
> > base64 (ascii) sections. Mix latin-1 and ascii => mix bytes. So
> > the best type should be bytes.
> >
> > => bytes
>
> Except that by the time they're parsed into an email message, they
> must be ascii, either encoded as base64 or quoted-printable. We also
> have to know at that point the charset being used, so I think it
> makes sense to keep everything as strings.

Actually, Victor's right here -- it makes more sense to treat them as
bytes. It's RFC 821 (SMTP) that requires 7-bit ASCII, not the MIME
format. Non-SMTP mail transports do exist, and are popular in various
places. Email transported via other transport mechanisms may, for
instance, use a Content-Transfer-Encoding of "binary" for some sections
of the message. Some parts of the top-most header of the message may be
counted on to be encoded as ASCII strings, but not the whole message in
general.

> > About base64, I agree with Bill Janssen:
> > - base64MIME.decode converts string to bytes
> > - base64MIME.encode converts bytes to string
>
> I agree.
>
> > But decode may accept bytes as input (as base64 modules does): use
> > str(value, 'ascii', 'ignore') or str(value, 'ascii', 'strict').
>
> Hmm, I'm not sure about this, but I think that .encode() may have to
> accept strings.

Personally, I think it would avoid more errors if it didn't. Let the
user explicitly encode the string to a particular representation before
calling base64.encode().

Bill
_______________________________________________
Email-SIG mailing list
Email-SIG@python.org
Your options: http://mail.python.org/mailman/options/email-sig/archive%40mail-archive.com
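
For illustration, here is a minimal sketch of the bytes/string split
discussed above, assuming Python 3 and the standard base64 module. The
names encode_body and decode_body are hypothetical; they are not the
email package's API. The encoder refuses strings so the caller has to
pick a charset explicitly, and the decoder accepts either a string or
ASCII bytes, using the strict str(value, 'ascii', 'strict') conversion
mentioned in the thread.

```python
import base64


def encode_body(payload):
    """Encode raw bytes as base64 and return the result as ASCII text.

    Hypothetical helper: strings are rejected on purpose, so the caller
    must first encode them to a particular representation.
    """
    if isinstance(payload, str):
        raise TypeError("encode the string to bytes first, "
                        "e.g. text.encode('latin-1')")
    return base64.b64encode(payload).decode("ascii")


def decode_body(value):
    """Decode base64 text (or ASCII bytes) back into raw bytes."""
    if isinstance(value, (bytes, bytearray)):
        # Fail loudly on non-ASCII input instead of silently dropping it.
        value = str(value, "ascii", "strict")
    return base64.b64decode(value)


# The caller chooses the charset; the encoder only ever sees bytes.
raw = "caf\u00e9".encode("latin-1")
text = encode_body(raw)
assert isinstance(text, str)
assert decode_body(text) == raw
```

With this split, passing an unencoded string to encode_body raises
TypeError right away rather than guessing a charset.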