Ulli Horlacher wrote:
> On Tue 2015-12-15 (11:10), Tim Roberts wrote:
>
>>> I have a python 2.7 program which runs in a console window and upload 
>>> files. 
>>> To specify the files, the user uses Windows drag&drop (via explorer) or 
>>> copy&paste.
>> This is hopeless.  In addition to the normal difficulties in
>> string/Unicode conversions, you have the added limitations of the
>> current console code page.  You simply cannot type characters in a
>> Windows console that are not present in your current code page.
> What is the current code page?

The Windows console shell is an 8-bit entity.  That means you only have
256 characters available at any given time, similar to the way
non-Unicode strings work in Python 2.  The "code page" determines which
256 you are using.  You can type "chcp" to see the current code page. 
On a US-English system the default is code page 437, which is derived
from the original IBM PC character ROMs.  You simply cannot type or
display characters that are not in the current code page.
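
To make the "which 256" point concrete, here is a small sketch using
only Python's built-in codecs.  The helper names decode_byte and
fits_codepage are mine, not part of any library, and nothing here
touches the real console; it only shows how one byte value maps to
different characters under different code pages, and how a name can
fail to fit in a given code page.

```python
def decode_byte(value, codepage):
    """Return the character that a single byte value maps to in a code page."""
    return bytes(bytearray([value])).decode(codepage)


def fits_codepage(text, codepage):
    """Return True if every character of text exists in the code page."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


# The same byte value is a different character in different code pages.
for cp in ("cp437", "cp850", "cp1252"):
    print("%s: %r" % (cp, decode_byte(0x93, cp)))

# o-circumflex exists in CP437; a Cyrillic letter does not.
print(fits_codepage(u"x\u00f4x.xxx", "cp437"))
print(fits_codepage(u"\u0416.txt", "cp437"))
```

On a real Windows console you would pass the code page reported by
"chcp" (for example "cp437" or "cp850") as the codepage argument.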

Using msvcrt.getwch does not convert the console to a Unicode entity. 
It merely means the characters you DO get are represented in Unicode.
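
For reference, a minimal sketch of reading one line with the wide-
character call.  The function name read_wide_line and its getwch
parameter are my own; the parameter exists only so the loop can be
exercised without a Windows console.  Function and arrow keys, which
arrive as a two-call sequence, are not handled here.

```python
def read_wide_line(getwch=None):
    """Collect characters from getwch() until Enter and return them as text.

    getwch defaults to msvcrt.getwch, which exists only on Windows; any
    zero-argument callable returning one character at a time will do.
    """
    if getwch is None:
        import msvcrt  # Windows-only module, imported lazily
        getwch = msvcrt.getwch
    chars = []
    while True:
        ch = getwch()
        if ch in (u"\r", u"\n"):  # Enter ends the line
            break
        chars.append(ch)
    return u"".join(chars)


# Simulated input, standing in for keystrokes or a dropped file name.
fake_keys = iter(u"x\u00f4x.xxx\r")
print(repr(read_wide_line(lambda: next(fake_keys))))
```

The result is a Unicode string, but it can still only contain what the
console was able to deliver in the first place.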

The Windows console theoretically supports a UTF-8 code page (chcp
65001), and it does fix many of these problems, but there are some
console apps that won't like it.


> drag&drop of a file with a non-ASCII filename from the explorer into a
> "naked" console window works:
>
> http://fex.rus.uni-stuttgart.de/fop/dLxrxPBV/X-201512152036.png
>
> Is there no way to catch it with Python?

There may be, as long as you can figure out the translation.   What do
you actually get when you try your example?  I created a file on my
machine called x⌠x.xxx, where that middle glyph is code 0xF4 in code
page 437.  When I drag and drop that to your program, it displayed
correctly.  However, everything is not perfect.  If I copy that name
from the console window and paste it into your app, I get xôx.xxx.  
Here, the middle glyph is 0x93 in code page 437, but it is U+00F4 in
Unicode.  Somebody converted the string to Unicode by simple widening,
and then converted back to CP437.

I get the same result if I paste my file name here directly:  xôx.xxx
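
The widening round trip can be reproduced with Python's codecs alone.
This is a sketch of what I believe happened, not a trace of what the
console actually does internally; widen and undo_widening are
hypothetical helper names, and "latin-1" is used because it maps each
byte to the Unicode code point of the same number.

```python
def widen(raw):
    """Map each byte straight to the code point with the same number."""
    return raw.decode("latin-1")


def undo_widening(text, codepage="cp437"):
    """Reverse widen(), then decode with the code page the bytes came from.

    Only works if every character in text is below U+0100.
    """
    return text.encode("latin-1").decode(codepage)


raw_name = b"x\xf4x.xxx"              # file name bytes under CP437
intended = raw_name.decode("cp437")   # the correct conversion to Unicode
widened = widen(raw_name)             # 0xF4 becomes U+00F4, o-circumflex
round_trip = widened.encode("cp437")  # o-circumflex is byte 0x93 in CP437

print(repr(intended))
print(repr(widened))
print(repr(round_trip))
print(undo_widening(widened) == intended)
```

If your program receives the o-circumflex form, something like
undo_widening may recover the real name, assuming you know which code
page the console was using at the time.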

-- 
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.

_______________________________________________
python-win32 mailing list
python-win32@python.org
https://mail.python.org/mailman/listinfo/python-win32
