jefm wrote:
Hi,
while checking out Python 3, I read that all text strings are now
natively Unicode.
True
In the Python language reference
(http://docs.python.org/3.0/reference/lexical_analysis.html) I read that
I can show Unicode characters in several ways.
"\uxxxx" supposedly allows me to specify the Unicode character by hex
number and the format "\N{name}" allows me to specify by Unicode
name.
These are ways to *specify* Unicode characters on input, not to display them.
Neither seems to work for me.
If you separate text creation from text printing, you would see that
they do. Try
s='\u20ac'
print(s)
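To check that both escape forms give the same one-character string without involving the console at all, something like the following sketch may help. It uses only the standard unicodedata module; the variable names are just for illustration.

```python
import unicodedata

# Two ways to *specify* the same character in source code.
by_hex = '\u20ac'
by_name = '\N{EURO SIGN}'

same = (by_hex == by_name)            # both spell the single character U+20AC
code_point = hex(ord(by_hex))
char_name = unicodedata.name(by_hex)

# ascii() escapes non-ASCII characters, so this line is safe to print
# even on a console that cannot display the euro sign.
print(same, len(by_hex), code_point, char_name, ascii(by_hex))
```

Because the last line prints only ASCII, it should not trigger the encoding error even in the Windows console.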
What am I doing wrong ?
Using the interactive interpreter running in a Windows console.
Please see error output below where I am trying to show the EURO sign
(http://www.fileformat.info/info/unicode/char/20ac/index.htm):
Python 3.0 (r30:67507, Dec 3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print('\u20ac')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\python30\lib\io.py", line 1491, in write
    b = encoder.encode(s)
  File "c:\python30\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u20ac' in position 0: character maps to <undefined>
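The traceback points at the cp437 codec, which suggests the failure is in encoding the string for the console rather than in creating it. A rough sketch of how one might confirm this, assuming cp437 is the console code page as in the traceback (the helper name can_encode is made up for this example):

```python
import sys

def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

euro = '\u20ac'

# The encoding Python uses when writing to the console
# (may be None if stdout has been replaced by something else).
print('stdout encoding:', getattr(sys.stdout, 'encoding', None))

# Compare a few codecs: the string is the same, only the target differs.
for name in ('cp437', 'cp1252', 'utf-8'):
    print(name, can_encode(euro, name))
```

If the reported stdout encoding is one of the codecs that cannot encode the character, print() will fail no matter which fonts are installed.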
With the standard console, I get the same error. But IDLE, which uses the
same Python build through a different interface, handles the character:
>>> s='\u20ac'
>>> len(s)
1
>>> str(s)
'€' # euro sign
I have fiddled with the console shortcut to supposedly make it work better,
as claimed by posts found on the web, but to no avail. This is very
frustrating, since I have fonts on the system for at least all of the first
64K chars. Scream at Microsoft, or try to find or encourage a console
replacement that Python could use. In the meantime, use IDLE. It is not
perfect for Unicode, but it is better.
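If you have to stay in the console, one possible stopgap is to replace whatever the console encoding cannot represent with an escape instead of letting print() raise. This is only a sketch: console_safe is a hypothetical helper, and falling back to 'ascii' when stdout reports no encoding is an assumption.

```python
import sys

def console_safe(text, encoding=None):
    """Return text with characters the encoding lacks shown as \\uXXXX escapes."""
    # Assumption: fall back to ASCII if stdout reports no encoding.
    encoding = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    # Encode with escapes for unencodable characters, then decode back to str.
    return text.encode(encoding, 'backslashreplace').decode(encoding)

print(console_safe('Price: 5 \u20ac'))
```

On a console whose encoding lacks the euro sign, the character comes out as an escape sequence rather than raising UnicodeEncodeError; where the encoding does support it, the text passes through unchanged.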
Terry Jan Reedy