On Thu, Apr 25, 2013 at 7:43 AM, Antoine Pitrou <solip...@pitrou.net> wrote:
> On Thu, 25 Apr 2013 04:19:36 +0200
> Lennart Regebro <rege...@gmail.com> wrote:
>> On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull <step...@xemacs.org>
>> wrote:
>> > RFC 4648 repeatedly refers to *characters*, without specifying an
>> > encoding for them.
> [...]
>>
>> Base64 is an encoding that transforms between 8-bit streams.
>
> No, it isn't. What Stephen wrote above.

Yes it is. Base64 takes 8-bit bytes and transforms them into another
8-bit stream that can be safely transmitted over various channels that
would mangle an unencoded 8-bit stream, such as email.

http://en.wikipedia.org/wiki/Base64

>> Either you get a "LookupError: unknown
>> encoding: base64", which is what you get now, or you get an
>> UnicodeEncodingError if the text is not ASCII. We don't want the
>> latter, because it means that code that looks fine for the developer
>> breaks in real life because the developer was American
>
> That's bogus.

No, that's real life.

> By the same argument, we should suppress any
> encoding which isn't able to represent all possible unicode strings.

No, if you explicitly use such an encoding, it is because you are
transferring data to a system that needs the encoding in question.
Unicode errors are unavoidable at that point, not an unexpected
surprise caused by an implicit conversion that you didn't know about.

//Lennart
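
To make the two behaviours under discussion concrete, here is a rough
sketch against the Python 3 standard library. The helper
implicit_ascii_b64 is hypothetical and exists only to imitate the
implicit ASCII step being argued against; it is not a proposed or
existing API.

```python
import base64

# Explicit route: the caller picks the text encoding, then Base64
# maps bytes to bytes.
payload = "café".encode("utf-8")
encoded = base64.b64encode(payload)
roundtrip = base64.b64decode(encoded).decode("utf-8")

# str.encode() does not accept base64 as a text encoding, so this
# fails for every input, ASCII or not.
lookup_error = None
try:
    "café".encode("base64")
except LookupError as exc:
    lookup_error = exc


def implicit_ascii_b64(text):
    """Hypothetical helper: silently encode str as ASCII, then Base64."""
    return base64.b64encode(text.encode("ascii"))


# The implicit route works while the data happens to be ASCII...
ascii_result = implicit_ascii_b64("cafe")

# ...and fails only once non-ASCII text shows up.
unicode_error = None
try:
    implicit_ascii_b64("café")
except UnicodeEncodeError as exc:
    unicode_error = exc
```

The first failure shows up the first time the code runs; the second
one shows up only when the input changes, which is the "breaks in
real life" case.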