"Yang" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
Hello,
I am trying to print the whole Unicode character list, from 0 to 65535,
on Windows.
I use WinXP with the Simplified Chinese locale. The ASCII characters from
0-127 and the CJK characters from 0x4E00-0x9FA4 print fine. Other Unicode
characters cause this error:
"UnicodeEncodeError: 'gbk' codec can't encode character u'\u9fa6' in
position 0"
My code is here:
for i in range(0,65536 ):
    uchar=unicode("\u%04X"%i,"unicode-escape")
    print "%x :"%i,uchar
How can I produce the whole Unicode list? Or can it be done at all?
Your console encoding is 'gbk', which can't represent every Unicode
character, so print raises the error as soon as it reaches a character
that gbk lacks. The following code writes the characters to a file using
an encoding that supports all Unicode characters (UTF-8); that file can
then be viewed in a program that supports the encoding (like Notepad for
this example). Still, what characters you see will depend on the font
used. Fonts generally do not support display of every Unicode character.
import codecs

# codecs.open always opens the file in binary mode, so plain 'w' is enough
f=codecs.open('unicode.txt','w',encoding='utf-8')
for i in xrange(32,0x10000): # skip C0 control chars (0-31)
    if i < 0xD800 or i > 0xDFFF: # skip surrogate code points
        f.write(u'%04X: %s\t' % (i,unichr(i)))
f.close()
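
If you still want to print to the console, one option is to test each
character against the console encoding first, or to escape the ones it
cannot represent. The following is only a rough sketch and assumes
Python 3, where every str is Unicode and chr() replaces unichr(). The
helper names can_encode and safe_text are made up for this example and
are not part of any library.

```python
# Sketch for Python 3. can_encode and safe_text are hypothetical helpers.

def can_encode(ch, encoding):
    """Return True if ch can be represented in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def safe_text(text, encoding):
    """Swap characters the encoding lacks for backslash escapes."""
    # 'backslashreplace' emits \uXXXX instead of raising UnicodeEncodeError
    return text.encode(encoding, errors='backslashreplace').decode(encoding)


if __name__ == '__main__':
    # 0x9FA6 is the code point from the error message in the question
    for code in (0x41, 0x4E00, 0x9FA6):
        print('%04X encodable in gbk: %s' % (code, can_encode(chr(code), 'gbk')))
```

In a real script you would probably pass sys.stdout.encoding instead of
the hard-coded 'gbk', so the check follows whatever console you run on.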
-Mark
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor