"Siegfried Heintze" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]

Make sure you are using the Lucida Console font for the cmd.exe window and
type the commands:

chcp 1251
python -c "print ''.join(unichr(i) for i in range(0x410,0x431))"

Output:

АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯа

Wow! I was not aware of that chcp command! Thanks! How could I do that "chcp 1251" programmatically?

The code was a little confusing because those two apostrophes look like a double quote!

But what are we doing here? Can you convince me that we are emitting UTF-8? I need UTF-8 because I need to experiment with some OS function calls that give me UTF-16 and I need to emit UTF-16 or UTF-8.

I think part of the problem is that Lucida Console is not as capable as "Arial Unicode MS" or the fonts used by urxvt-X.

In this case, it is not emitting UTF-8. It is emitting the windows-1251 encoding. As another poster mentioned, the Windows console gets an error when attempting to write UTF-8 when the code page is 65001 (UTF-8). But you can write output to a file explicitly in UTF-8 or UTF-16 and view the file with Notepad. I've used this method for processing Chinese.

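import os, codecs
data = u''.join(unichr(i) for i in range(0x410,0x431))
# Write as UTF-8 and close the file so it is flushed before viewing
f = codecs.open('out.txt','w','utf-8')
f.write(data)
f.close()
# Windows only: open the file in its associated application
os.startfile('out.txt')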
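
To see that these really are different encodings of the same text, here is a small sketch that dumps the bytes each one produces for the first three letters of the range. The helper name hex_bytes is my own invention for illustration, and the snippet is written so that it should run on both Python 2 and Python 3.3+:

```python
import binascii

# First three letters of the range used above: U+0410, U+0411, U+0412
text = u'\u0410\u0411\u0412'

def hex_bytes(s, encoding):
    """Hypothetical helper: encode s and return the bytes as a hex string."""
    return binascii.hexlify(s.encode(encoding)).decode('ascii')

cp1251_hex = hex_bytes(text, 'windows-1251')  # one byte per letter
utf8_hex = hex_bytes(text, 'utf-8')           # two bytes per letter
utf16_hex = hex_bytes(text, 'utf-16-le')      # two bytes per letter, low byte first

print(cp1251_hex)  # c0c1c2
print(utf8_hex)    # d090d091d092
print(utf16_hex)   # 100411041204
```

The console only shows the right glyphs when the bytes written match the active code page, which is why the chcp 1251 example has to emit windows-1251 rather than UTF-8.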

P.S.

One way to set the code page programmatically is to use ctypes, but this will only work in a Windows console:

>>> import ctypes
>>> k=ctypes.WinDLL('kernel32')
>>> k.SetConsoleOutputCP(1251)
1
>>> print u''.join(unichr(i) for i in range(0x410,0x430)).encode('windows-1251')
АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
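
If you do this from a script, it is probably polite to put the old code page back when you are done. A rough sketch, assuming the kernel32 functions GetConsoleOutputCP and SetConsoleOutputCP; the wrapper name set_console_output_cp is my own, and it simply does nothing when not on Windows or when no console is attached:

```python
import sys

def set_console_output_cp(codepage):
    """Hypothetical wrapper: switch the console output code page.

    Returns the previous code page, or None when not running on Windows
    or when the process has no console attached.
    """
    if sys.platform != 'win32':
        return None
    import ctypes
    kernel32 = ctypes.WinDLL('kernel32')
    previous = kernel32.GetConsoleOutputCP()
    if previous == 0:
        # GetConsoleOutputCP returns 0 on failure, e.g. no console
        return None
    if not kernel32.SetConsoleOutputCP(codepage):
        raise OSError('SetConsoleOutputCP(%d) failed' % codepage)
    return previous

previous = set_console_output_cp(1251)
try:
    pass  # write windows-1251 encoded output here
finally:
    # Restore the code page the console had before
    if previous:
        set_console_output_cp(previous)
```

This only changes the output code page; the console font still has to contain the glyphs, so the Lucida Console advice above still applies.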

--Mark

--
http://mail.python.org/mailman/listinfo/python-list
