Re: [Python-Dev] Support of UTF-16 and UTF-32 source encodings

2015-11-14 Thread eryksun
On Sat, Nov 14, 2015 at 7:06 PM, Steve Dower wrote:
> The native encoding on Windows has been UTF-16 since Windows NT. Obviously
> we've survived without Python tokenization support for a long time, but
> every API uses it.

Windows 2000 was the first version to have broad support for UTF-16.
Windows NT (1993) was released before UTF-16, so its Unicode support
is limited to UCS-2.

(Note that console windows still restrict each character cell to a
single WCHAR character. So a non-BMP character encoded as a UTF-16
surrogate pair always appears as two box glyphs. Of course you can
copy and paste from the console to a UTF-16 aware window just fine.)
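
To make the surrogate-pair point concrete, here is a minimal sketch that lists the 16-bit code units Python produces when encoding a string as UTF-16. The helper name utf16_units is made up for illustration; each unit corresponds to one WCHAR on Windows.

```python
def utf16_units(s):
    """Return the 16-bit UTF-16 code units of s (helper name is illustrative)."""
    data = s.encode('utf-16-le')  # little-endian, no BOM
    return [int.from_bytes(data[i:i + 2], 'little')
            for i in range(0, len(data), 2)]

# A BMP character needs one code unit; U+1F600 is outside the BMP
# and needs two (a high surrogate followed by a low surrogate).
bmp_units = utf16_units('A')
non_bmp_units = utf16_units('\U0001F600')
print([hex(u) for u in bmp_units])
print([hex(u) for u in non_bmp_units])
```

Since the console gives each code unit its own cell, the two surrogates of a non-BMP character are rendered separately.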

> I've hit a few cases where it would have been handy for Python to be able to
> detect it, though nothing I couldn't work around.

Can you elaborate on some example cases? I can see using UTF-16 for the
REPL in the Windows console, but a hypothetical WinConIO class could
simply transcode to and from UTF-8. Drekin's win-unicode-console
package works like this.
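
The transcoding step itself is small. The following is only a sketch of that step, with hypothetical function names; it does not show the actual console calls (ReadConsoleW and WriteConsoleW on Windows) and it is not how win-unicode-console is implemented internally.

```python
def console_to_utf8(wide_bytes):
    """Convert UTF-16-LE bytes, as a wide-character read would return, to UTF-8."""
    return wide_bytes.decode('utf-16-le').encode('utf-8')

def utf8_to_console(utf8_bytes):
    """Convert UTF-8 bytes to UTF-16-LE, ready for a wide-character write."""
    return utf8_bytes.decode('utf-8').encode('utf-16-le')

# Round trip, with a plain bytes object standing in for console input.
typed = 'caf\u00e9 \U0001F600'.encode('utf-16-le')
as_utf8 = console_to_utf8(typed)
assert utf8_to_console(as_utf8) == typed
```

A real implementation would also have to cope with a read that ends between the two halves of a surrogate pair, for example by using an incremental decoder.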
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Support of UTF-16 and UTF-32 source encodings

2015-11-14 Thread eryksun
On Sat, Nov 14, 2015 at 7:15 PM, Chris Angelico wrote:
> Can the py.exe launcher handle a UTF-16 shebang? (I'm pretty sure Unix
> program loaders won't.) That alone might be a reason for strongly
> encouraging ASCII-compat encodings.

The launcher supports shebangs encoded as UTF-8 (default), UTF-16
(LE/BE), and UTF-32 (LE/BE):

https://hg.python.org/cpython/file/v3.5.0/PC/launcher.c#l1138
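
For readers who don't want to dig through the C source, this kind of detection can be approximated in Python by checking for a byte order mark. This is a sketch under that assumption, not a port of launcher.c; read_shebang and the table name are invented for the example.

```python
import codecs

# The UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM,
# so the UTF-32 entries have to be tested first.
_BOM_TABLE = [
    (codecs.BOM_UTF32_LE, 'utf-32-le'),
    (codecs.BOM_UTF32_BE, 'utf-32-be'),
    (codecs.BOM_UTF8, 'utf-8'),
    (codecs.BOM_UTF16_LE, 'utf-16-le'),
    (codecs.BOM_UTF16_BE, 'utf-16-be'),
]

def read_shebang(data, default='utf-8'):
    """Return the shebang line in data (bytes), or None if there isn't one."""
    encoding = default
    for bom, name in _BOM_TABLE:
        if data.startswith(bom):
            data = data[len(bom):]  # drop the BOM before decoding
            encoding = name
            break
    lines = data.decode(encoding, errors='replace').splitlines()
    first = lines[0] if lines else ''
    return first if first.startswith('#!') else None

script = codecs.BOM_UTF16_LE + '#!python3\nprint(1)\n'.encode('utf-16-le')
print(read_shebang(script))
```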


Re: [Python-Dev] PEP-8 wart... it recommends short names because of DOS

2015-10-21 Thread eryksun
On 10/21/15, Serhiy Storchaka wrote:
> On 21.10.15 04:25, Gregory P. Smith wrote:
>> https://www.python.org/dev/peps/pep-0008/#names-to-avoid
>>
>> /"Since module names are mapped to file names, and some file systems are
>> case insensitive and truncate long names, it is important that module
>> names be chosen to be fairly short -- this won't be a problem on Unix,
>> but it may be a problem when the code is transported to older Mac or
>> Windows versions, or DOS."/
>>
>> There haven't been computers with less than 80 character file or path
>> name element length limits in wide use in decades... ;)
>
> We should also avoid special file names like con.py or lpt1.py.

Other file names to avoid on Windows are conin$.py, conout$.py,
aux.py, prn.py, nul.py, lpt[1-9].py, and com[1-9].py.
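
A project could check for these names with something like the sketch below. The set is just the list given here, and the matching rule (compare the part before the first dot, ignoring case and trailing spaces) is an approximation that I'm assuming for illustration; the exact behavior differs between Windows versions. The function name is made up.

```python
import ntpath

# Device names from the list above; com1..com9 and lpt1..lpt9 are expanded.
RESERVED_NAMES = (
    {'con', 'conin$', 'conout$', 'aux', 'prn', 'nul'}
    | {'com%d' % i for i in range(1, 10)}
    | {'lpt%d' % i for i in range(1, 10)}
)

def is_reserved_module_file(path):
    """Return True if the file name in path collides with a DOS device name."""
    name = ntpath.basename(path)
    # Approximation: only the part before the first dot is compared.
    stem = name.partition('.')[0].rstrip(' ').lower()
    return stem in RESERVED_NAMES

for candidate in ('con.py', r'C:\pkg\LPT1.py', 'console.py', 'com10.py'):
    print(candidate, is_reserved_module_file(candidate))
```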

Using these device names in a file name requires the fully qualified
wide-character path, prefixed by \\?\. Incidentally, this prefix also
allows paths of up to about 32,767 characters, if there's concern that
long module names in packages might exceed the Windows 260-character
limit.
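
Adding the prefix can be sketched as follows. The function name is hypothetical, the path must already be absolute, and the \\?\UNC\ form for network shares is the documented convention for UNC paths. Using ntpath keeps the example runnable on any platform.

```python
import ntpath

def extended_path(path):
    """Return path with the \\\\?\\ prefix added (path must be absolute)."""
    if path.startswith('\\\\?\\'):
        return path  # already prefixed
    if not ntpath.isabs(path):
        raise ValueError('an absolute path is required: %r' % path)
    # The prefix turns off normalization by the system, so resolve '.'
    # and '..' and convert forward slashes up front.
    path = ntpath.normpath(path)
    if path.startswith('\\\\'):
        # \\server\share\... becomes \\?\UNC\server\share\...
        return '\\\\?\\UNC\\' + path[2:]
    return '\\\\?\\' + path

print(extended_path(r'C:\con.py'))
print(extended_path(r'\\server\share\pkg\mod.py'))
```

The result is meant to be passed to the wide-character file APIs, which is what Python 3 uses on Windows when given a str path.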

Here's an example of what would actually be opened for con.py, etc, at
least on my current Windows 10 machine:

devs = ('aux prn com1 com9 lpt1 lpt9 '
        'nul con conin$ conout$'.split())

for dev in devs:
    ntpath = to_nt(r'C:\%s.py' % dev)
    print(ntpath.ljust(11), '=>', query_link(ntpath))

output:

\??\aux     => \DosDevices\COM1
\??\prn     => \DosDevices\LPT1
\??\com1    => object name not found
\??\com9    => object name not found
\??\lpt1    => \Device\Parallel0
\??\lpt9    => object name not found
\??\nul     => \Device\Null
\??\con     => \Device\ConDrv\Console
\??\conin$  => \Device\ConDrv\CurrentIn
\??\conout$ => \Device\ConDrv\CurrentOut

The \\?\ prefix avoids DOS name translation. The only change made by
the system is to replace the \\?\ prefix with \??\ in the path:

for dev in devs:
    print(to_nt(r'\\?\C:\%s.py' % dev))

output:

\??\C:\aux.py
\??\C:\prn.py
\??\C:\com1.py
\??\C:\com9.py
\??\C:\lpt1.py
\??\C:\lpt9.py
\??\C:\nul.py
\??\C:\con.py
\??\C:\conin$.py
\??\C:\conout$.py

On this machine, \??\C: is a link to \Device\HarddiskVolume2.

(to_nt and query_link call the native API functions
RtlDosPathNameToNtPathName_U, NtOpenSymbolicLinkObject, and
NtQuerySymbolicLinkObject. Note that Microsoft doesn't support calling
the native NT API from applications in user mode.)