On Thu, Sep 16, 2010 at 09:52:48AM -0400, Barry Warsaw wrote:
> On Sep 16, 2010, at 11:28 PM, Nick Coghlan wrote:
>
> >There are some APIs that should be able to handle bytes *or* strings,
> >but the current use of string literals in their implementation means
> >that bytes don't work. This turns out to be a PITA for some networking
> >related code which really wants to be working with raw bytes (e.g.
> >URLs coming off the wire).
>
> Note that email has exactly the same problem. A general solution -- even if
> embodied in *well documented* best-practices and convention -- would really
> help make the stdlib work consistently, and I bet third party libraries too.
> I too await a solution with abated breath :-)

I've been working on documenting best practices for APIs and Unicode. For
this type of function (take bytes or unicode and output the same type),
knowing the encoding seems like a requirement in most cases:

http://packages.python.org/kitchen/designing-unicode-apis.html#take-either-bytes-or-unicode-output-the-same-type

I'd love to add another strategy there that shows how you can robustly
operate on bytes without knowing the encoding but, having written that
page, I think that any time you simplify your API you have to accept
limitations on the data you can take in. (For instance, some
simplifications can handle anything except ASCII-incompatible encodings.)

-Toshio
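
To make the trade-off above concrete, here is a minimal Python 3 sketch of
the two approaches. The function names and the trailing-slash task are
hypothetical, chosen only for illustration; they are not taken from kitchen
or the stdlib. The first variant needs to be told the encoding; the second
avoids that by picking a literal of the matching type, which quietly
assumes an ASCII-compatible encoding.

```python
def strip_trailing_slash(path, encoding="utf-8"):
    """Hypothetical helper: accept bytes or str, return the same type.

    Bytes are decoded at the boundary, processed as text, and re-encoded,
    so the caller has to know (or guess) the encoding.
    """
    if isinstance(path, bytes):
        text = path.decode(encoding)
        return strip_trailing_slash(text, encoding).encode(encoding)
    return path.rstrip("/")


def strip_trailing_slash_ascii(path):
    """Hypothetical simplification: no encoding argument.

    Picks a bytes or str literal to match the input. This only behaves
    correctly when the bytes are in an ASCII-compatible encoding.
    """
    slash = b"/" if isinstance(path, bytes) else "/"
    return path.rstrip(slash)


if __name__ == "__main__":
    print(strip_trailing_slash("a/b/"))
    print(strip_trailing_slash(b"a/b/"))
    print(strip_trailing_slash_ascii(b"a/b/"))

    # UTF-16 is not ASCII-compatible: the simplified version leaves the
    # data untouched, while the encoding-aware version handles it.
    utf16 = "a/".encode("utf-16-le")
    print(strip_trailing_slash_ascii(utf16) == utf16)
    print(strip_trailing_slash(utf16, encoding="utf-16-le"))
```

The simplified version is shorter and never has to guess an encoding, but
that is exactly the limitation mentioned above: it accepts anything except
ASCII-incompatible encodings.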
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com