Re: Yet another unicode WTF

2009-06-05 Thread Ben Finney
Gabriel Genellina gagsl-...@yahoo.com.ar writes:
> Python knows the terminal encoding (or at least can make a good guess), but a file may use *any* encoding you want, completely unrelated to your terminal settings.
It may, yes, and the programmer is free to specify any encoding. So when
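
One way to read the point about the programmer specifying the encoding: wrap the byte stream in a writer that carries an explicit encoding, instead of relying on whatever Python guesses. What follows is a minimal sketch, assuming Python 3 and UTF-8 as the chosen encoding. `make_writer` is a hypothetical helper name, and the `io.BytesIO` object only stands in for a redirected stdout.

```python
import io


def make_writer(byte_stream, encoding="utf-8"):
    """Wrap a binary stream so text written to it is encoded explicitly."""
    # newline="\n" disables platform newline translation;
    # write_through=True hands text to the underlying buffer immediately.
    return io.TextIOWrapper(
        byte_stream, encoding=encoding, newline="\n", write_through=True
    )


# io.BytesIO stands in for the file or pipe that receives the output.
sink = io.BytesIO()
writer = make_writer(sink)
writer.write("\u03bb\n")
writer.flush()
data = sink.getvalue()  # the encoded bytes for U+03BB plus a newline
```

On a real Python 3 process the same helper could presumably be applied to `sys.stdout.buffer`. On Python 2.6, the version under discussion, `codecs.getwriter("utf-8")(sys.stdout)` plays a comparable role.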

Re: Yet another unicode WTF

2009-06-05 Thread Ned Deily
In article 8763fbmk5a@benfinney.id.au, Ben Finney ben+pyt...@benfinney.id.au wrote:
> Ned Deily n...@acm.org writes:
> > $ python2.6 -c 'import sys; print sys.stdout.encoding, \
> >  sys.stdout.isatty()'
> > UTF-8 True
> > $ python2.6 -c 'import sys; print sys.stdout.encoding, \

Re: Yet another unicode WTF

2009-06-05 Thread Paul Boddie
On 5 Jun, 03:18, Ron Garret rnospa...@flownet.com wrote:
> According to what I thought I knew about unix (and I had fancied myself a bit of an expert until just now) this is impossible. Python is obviously picking up a different default encoding when its output is being piped to a file, but I

Re: Yet another unicode WTF

2009-06-05 Thread Paul Boddie
On 5 Jun, 11:51, Ben Finney ben+pyt...@benfinney.id.au wrote:
> Actually strings in Python 2.4 or later have the ‘encode’ method, with no need for importing extra modules:
> $ python -c 'import sys; sys.stdout.write(u"\u03bb\n".encode("utf-8"))'
> λ
> $ python -c 'import sys;
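
The explicit `encode` call quoted above can be sketched in Python 3 syntax as follows. `encode_line` is a hypothetical helper, not something from the thread. The second half shows what happens with a codec that cannot represent the character; whether that matches the truncated traceback elsewhere in this thread is an assumption.

```python
def encode_line(text, encoding="utf-8"):
    """Return the bytes for one line of text in the given encoding."""
    return (text + "\n").encode(encoding)


# UTF-8 can represent U+03BB (Greek small letter lambda).
utf8_bytes = encode_line("\u03bb")

# The ASCII codec has no representation for U+03BB, so encoding fails.
try:
    encode_line("\u03bb", "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

Encoding explicitly like this takes the decision away from Python's guess about the output stream, which is the point of the quoted one-liner.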

Re: Yet another unicode WTF

2009-06-05 Thread Ned Deily
In article nad-31678a.00033005062...@ger.gmane.org, Ned Deily n...@acm.org wrote:
> In python 3.x, of course, the encoding happens automatically but you still have to tell python, via the encoding argument to open, what the encoding of the file's content is (or accept python's default which
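
A minimal sketch of the `encoding` argument to `open` mentioned above, assuming Python 3 and UTF-8. The temporary file exists only for illustration; the variable names are not from the thread.

```python
import os
import tempfile

text = "\u03bb\n"

# Create a scratch file; the path is only used for this illustration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Name the encoding explicitly instead of accepting the platform default.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(text)
    # Read the raw bytes to see what was actually stored.
    with open(path, "rb") as f:
        raw = f.read()
    # Read the text back, again stating the encoding of the file's content.
    with open(path, "r", encoding="utf-8") as f:
        round_trip = f.read()
finally:
    os.remove(path)
```

Passing the same encoding on both the write and the read side is what makes the round trip independent of the terminal or locale settings.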

Yet another unicode WTF

2009-06-04 Thread Ron Garret
Python 2.6.2 on OS X 10.5.7:
[...@mickey:~]$ echo $LANG
en_US.UTF-8
[...@mickey:~]$ cat frob.py
#!/usr/bin/env python
print u'\u03BB'
[...@mickey:~]$ ./frob.py
λ
[...@mickey:~]$ ./frob.py > foo
Traceback (most recent call last):
  File "./frob.py", line 2, in <module>
    print u'\u03BB'

Re: Yet another unicode WTF

2009-06-04 Thread Lawrence D'Oliveiro
In message rnospamon-e7e08b.18181804062...@news.gha.chartermi.net, Ron Garret wrote:
> Python 2.6.2 on OS X 10.5.7:
Same result, Python 2.6.1-3 on Debian Unstable. My $LANG is en_NZ.UTF-8.
> ... I always thought one of the fundamental invariants of unix processes was that there's no way for a

Re: Yet another unicode WTF

2009-06-04 Thread Ben Finney
Ron Garret rnospa...@flownet.com writes:
> According to what I thought I knew about unix (and I had fancied myself a bit of an expert until just now) this is impossible. Python is obviously picking up a different default encoding when its output is being piped to a file, but I always

Re: Yet another unicode WTF

2009-06-04 Thread Gabriel Genellina
On Thu, 04 Jun 2009 22:18:24 -0300, Ron Garret rnospa...@flownet.com wrote:
> Python 2.6.2 on OS X 10.5.7:
> [...@mickey:~]$ echo $LANG
> en_US.UTF-8
> [...@mickey:~]$ cat frob.py
> #!/usr/bin/env python
> print u'\u03BB'
> [...@mickey:~]$ ./frob.py
> λ
> [...@mickey:~]$ ./frob.py > foo
> Traceback (most

Re: Yet another unicode WTF

2009-06-04 Thread Ron Garret
In article h09ten$5q...@lust.ihug.co.nz, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
> In message rnospamon-e7e08b.18181804062...@news.gha.chartermi.net, Ron Garret wrote:
> > Python 2.6.2 on OS X 10.5.7:
> Same result, Python 2.6.1-3 on Debian Unstable. My $LANG is

Re: Yet another unicode WTF

2009-06-04 Thread Ben Finney
Ron Garret rnospa...@flownet.com writes:
> Python 2.6.2 on OS X 10.5.7:
> [...@mickey:~]$ echo $LANG
> en_US.UTF-8
> [...@mickey:~]$ cat frob.py
> #!/usr/bin/env python
> print u'\u03BB'
> [...@mickey:~]$ ./frob.py
> λ
> [...@mickey:~]$ ./frob.py > foo
> Traceback (most recent call last):
>   File

Re: Yet another unicode WTF

2009-06-04 Thread Ned Deily
In article rnospamon-e7e08b.18181804062...@news.gha.chartermi.net, Ron Garret rnospa...@flownet.com wrote:
> Python 2.6.2 on OS X 10.5.7:
> [...@mickey:~]$ echo $LANG
> en_US.UTF-8
> [...@mickey:~]$ cat frob.py
> #!/usr/bin/env python
> print u'\u03BB'
> [...@mickey:~]$ ./frob.py
> λ

Re: Yet another unicode WTF

2009-06-04 Thread Ben Finney
Ned Deily n...@acm.org writes:
> $ python2.6 -c 'import sys; print sys.stdout.encoding, \
>  sys.stdout.isatty()'
> UTF-8 True
> $ python2.6 -c 'import sys; print sys.stdout.encoding, \
>  sys.stdout.isatty()' > foo ; cat foo
> None False
So shouldn't the second case also detect UTF-8? The filesystem
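
The session quoted above shows `sys.stdout.encoding` reporting `None` once output is redirected. One possible fallback policy, sketched here as an assumption and not as what Python itself does, is to consult the locale when the stream reports no encoding. `pick_output_encoding` and `FakeStream` are hypothetical names introduced only for this illustration.

```python
import locale


def pick_output_encoding(stream, fallback="utf-8"):
    """Choose an encoding for a stream, even when the stream reports none."""
    # A redirected stdout may report None, as in the Python 2.6 session above.
    encoding = getattr(stream, "encoding", None)
    if encoding:
        return encoding
    # Otherwise use the locale's preferred encoding, then a fixed default.
    return locale.getpreferredencoding(False) or fallback


class FakeStream(object):
    """Hypothetical stand-in for sys.stdout with a settable encoding."""

    def __init__(self, encoding):
        self.encoding = encoding


tty_like = pick_output_encoding(FakeStream("UTF-8"))
redirected = pick_output_encoding(FakeStream(None))
```

In a real script the call would be `pick_output_encoding(sys.stdout)`. The value returned for the redirected case depends on the environment the script runs in.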

Re: Yet another unicode WTF

2009-06-04 Thread Lawrence D'Oliveiro
In message mailman.1149.1244167714.8015.python-l...@python.org, Gabriel Genellina wrote:
> Python knows the terminal encoding (or at least can make a good guess), but a file may use *any* encoding you want, completely unrelated to your terminal settings.
It should still respect your