Stephen J. Turnbull wrote:
>     Bengt> The characters in b could be encoded in plain ascii, or
>     Bengt> utf16le, you have to know.
>
> Which base64 are you thinking about?  Both RFC 3548 and RFC 2045
> (MIME) specify subsets of US-ASCII explicitly.

Unfortunately, it is ambiguous whether they refer to US-ASCII, the
character set, or US-ASCII, the encoding.

It appears that RFC 3548 talks about the character set only:
- section 2.4 talks about "choosing an alphabet", and how it should
  be possible for humans to handle such data.
- section 2.3 talks about non-alphabet characters

So it appears that RFC 3548 defines a conversion bytes->text. To
transmit this, you then also need an encoding. MIME appears to also
use the US-ASCII *encoding* ("charset", in IETF speak) for the
"base64" Content-Transfer-Encoding.

For an example where base64 is *not* necessarily ASCII-encoded,
see the "base64Binary" data type in XML Schema. There, base64 is
embedded into an XML document, and uses the encoding of the entire
XML document. As a result, you may get base64 data in utf16le.

Regards,
Martin
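
To make the distinction above concrete, here is a minimal Python 3 sketch
of the two steps: base64 as a conversion from bytes to characters, and a
separate choice of encoding for those characters. The payload and the
variable names are illustrative assumptions, not taken from either RFC.

```python
import base64

# Arbitrary binary payload (illustrative only).
payload = b"\x00\xffhello"

# Step 1: bytes -> characters of the base64 alphabet.
# b64encode() returns ASCII-coded bytes, so decode them to obtain text.
text = base64.b64encode(payload).decode("ascii")

# Step 2: text -> bytes for transmission. The encoding is a separate choice.
as_ascii = text.encode("ascii")         # what MIME uses
as_utf16le = text.encode("utf-16-le")   # what a UTF-16LE XML document would carry

# Same base64 text, different bytes on the wire.
assert as_ascii != as_utf16le
assert len(as_utf16le) == 2 * len(as_ascii)

# The receiver has to know the encoding to get the base64 text back,
# and only then can it recover the original bytes.
recovered = base64.b64decode(as_utf16le.decode("utf-16-le"))
assert recovered == payload
```

The same base64 text yields twice as many bytes in UTF-16LE as in ASCII,
so the receiver has to know which encoding was used before decoding.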