Re: Yet another unicode WTF

Paul Boddie Fri, 05 Jun 2009 04:11:25 -0700

On 5 Jun, 11:51, Ben Finney <ben+pyt...@benfinney.id.au> wrote:
>
> Actually strings in Python 2.4 or later have the ‘encode’ method, with
> no need for importing extra modules:
>
> =====
> $ python -c 'import sys; sys.stdout.write(u"\u03bb\n".encode("utf-8"))'
> λ
>
> $ python -c 'import sys; sys.stdout.write(u"\u03bb\n".encode("utf-8"))' > foo ; cat foo
> λ
> =====


Those are Unicode objects, not traditional Python strings. Although
strings do have decode and encode methods, even in Python 2.3, the
former is shorthand for constructing a Unicode object using the
stated encoding, whereas the latter first decodes the string
implicitly, using the error-prone default encoding (usually ASCII), in
order to create a Unicode object and then encodes the result. In
effect, it recodes the string, and it fails on any non-ASCII byte.
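
To make the difference concrete, here is a minimal sketch. The names
raw, text and implicit_ok are only illustrative, and the explicit
decode("ascii") stands in for the implicit step that Python 2 performs
when encode is called on a byte string, assuming the default encoding
has been left as ASCII.

```python
# A byte string holding the UTF-8 encoding of U+03BB (Greek small lambda).
raw = u"\u03bb".encode("utf-8")

# decode: builds a Unicode object from the bytes using the stated encoding
# (equivalent to unicode(raw, "utf-8") in Python 2).
text = raw.decode("utf-8")

# encode on a byte string in Python 2 behaves roughly like this: an implicit
# decode with the default encoding (assumed to be ASCII here), then an encode.
try:
    raw.decode("ascii").encode("utf-8")
    implicit_ok = True
except UnicodeDecodeError:
    # The bytes are not ASCII, so the implicit first step fails.
    implicit_ok = False
```

With these bytes the explicit decode succeeds, while the recoding path
stops at the ASCII decode before it ever reaches the encode.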

As I noted, if one wants to remain sane and not think about encoding
everything everywhere, creating a stream using a codecs module
function or class will permit the construction of something which
deals with Unicode objects satisfactorily.
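
As a rough sketch of that approach, codecs.getwriter can wrap a byte
stream so that Unicode objects are encoded on the way out. The
in-memory io.BytesIO object below is only a stand-in for a real byte
stream such as a file opened in binary mode; the names raw, out and
encoded are illustrative.

```python
import codecs
import io

# In-memory byte stream standing in for a file or terminal (an assumption
# made so the example is self-contained).
raw = io.BytesIO()

# codecs.getwriter returns a StreamWriter class for the named encoding;
# instantiating it wraps the underlying byte stream.
out = codecs.getwriter("utf-8")(raw)

# Write a Unicode object; the wrapper does the encoding.
out.write(u"\u03bb\n")

# The underlying stream now holds the UTF-8 bytes.
encoded = raw.getvalue()
```

In Python 2 the same wrapper can be applied to sys.stdout itself, so
that the calling code never has to call encode explicitly.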

Paul
-- 
http://mail.python.org/mailman/listinfo/python-list
