[Python-ideas] Re: Please update shutil.py copyfileobj to include code mentioned in below issues

2023-06-08 Thread Eryk Sun
On 6/8/23, jsch...@sbcglobal.net  wrote:
> I opened two issues regarding copyfileobj that were not bugs, but a fix that
> was involved helped me figure out I needed a new external drive, since it
> displayed the error number from the copyfileobj function.  I'd like a
> modified version of this code implemented permanently in shutil.py so others
> could see if they have the same issue as me.

I don't know how many people still subscribe to and read this mailing
list. More people would see this suggestion if you posted this on
discuss.python.org/c/ideas.

> This is the original issue that has the code I was using that Eryksun
> posted.
>
> https://github.com/python/cpython/issues/96721
>
> Here's the second issue where it happened again.  I put the error message in
> this post, so you can see how it helped me.  Also, the code might need to be
> modified slightly, since it generated an error.
>
> https://github.com/python/cpython/issues/102357

The ctypes code that I provided was only for debugging purposes.
Python needs to support the C runtime's _doserrno value (actually it's
a Windows error code) internally for I/O calls such as _wopen(),
close(), read(), and write().

Also, the error that you encountered, ERROR_NO_SUCH_DEVICE (433),
should be mapped to the C errno value ENOENT (i.e. FileNotFoundError)
in PC/errmap.h.
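
As a small illustration of why the errno mapping matters (a sketch only, not the proposed change to PC/errmap.h): Python selects the OSError subclass from the errno value, so an error that arrives as ENOENT surfaces as FileNotFoundError. The message text below is made up for the example.

```python
import errno

# Constructing OSError with an errno value selects the matching subclass.
exc = OSError(errno.ENOENT, "No such device")  # message text is illustrative

print(type(exc).__name__)
```

Code that already catches FileNotFoundError would then handle a disconnected drive without any Windows-specific checks.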
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KVIYVVZCHCOYLOFQNJDNILCF7KQVBR6A/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)

2022-11-17 Thread Eryk Sun
On 11/7/22, Eryk Sun  wrote:
>
> def isjunction(path):
>     """Test whether a path is a junction.
>     """
>     try:
>         st = os.lstat(path)
>     except (OSError, ValueError, AttributeError):
>         return False
>     return bool(st.st_reparse_tag & stat.IO_REPARSE_TAG_MOUNT_POINT)

The bitwise AND check in the above is wrong. It should check whether
the tag *equals* IO_REPARSE_TAG_MOUNT_POINT. Sorry, this was an
editing mistake when I simplified the expression to remove a redundant
check of st_file_attributes.
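
To make the problem concrete, here is a minimal sketch with the two tag values hard-coded (they are the Windows SDK values, written out so the snippet runs on any platform). The helper names are hypothetical. Because the symlink and mount-point tags share high bits, the AND test also reports a symlink as a junction.

```python
# Reparse tag values from the Windows SDK, hard-coded for portability.
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003
IO_REPARSE_TAG_SYMLINK = 0xA000000C

def is_junction_tag_and(tag):
    """Buggy test: true whenever the tags share any set bits."""
    return bool(tag & IO_REPARSE_TAG_MOUNT_POINT)

def is_junction_tag_eq(tag):
    """Corrected test: true only for the mount-point tag itself."""
    return tag == IO_REPARSE_TAG_MOUNT_POINT

and_says_symlink_is_junction = is_junction_tag_and(IO_REPARSE_TAG_SYMLINK)
eq_says_symlink_is_junction = is_junction_tag_eq(IO_REPARSE_TAG_SYMLINK)
print(and_says_symlink_is_junction, eq_says_symlink_is_junction)
```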

This idea is being developed for Python 3.12:

https://github.com/python/cpython/issues/99547
https://github.com/python/cpython/pull/99548
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/GEUZJE2Q2ACFGSMPHWPO5437CXRRNAZ3/


[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)

2022-11-08 Thread Eryk Sun
On 11/8/22, Charles Machalow  wrote:
>
> Funny enough in PowerShell, for prints an "l" for both symlinks and
> junctions.. so it kind of thinks of it as a link of some sort too I guess.

As does Python already in many cases. On Windows, symlinks and mount
points (junctions) belong to the general category of name-surrogate
reparse points, and os.lstat() doesn't traverse any of them.

If Python supported copying a mount point via
os.symlink(os.readlink(src), dst), I'd be reluctantly in favor of just
letting ntpath.islink() return true for a mount point, as a practical
measure for seamless cross-platform implementations of functions like
rmtree() and copytree(). In terms of POSIX that's nonsense, but not
really on Windows.

> Is it that much of a waste to just return False on posix? I mean it's a
> couple lines and just maintains api.. and in theory can be more clear to
> some.

I'm just thinking this through in terms of conceptual cost and
usefulness in the standard library relative to how easy it is to
implement one's own isjunction() or is_name_surrogate() test. Of
course, a lot of the os.path tests have simple implementations, such
as exists(), isdir() and isfile(). They're in the standard library
because they're commonly needed. The question is whether isjunction()
is needed enough generally to justify adding it.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/G4YQTXFPDN5YQLNYUUKCP2NV4DLGWSTN/


[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)

2022-11-07 Thread Eryk Sun
On 11/8/22, Charles Machalow  wrote:
> I tend to prefer adding isjunction instead of changing ismount since I tend
> to not think about junctions as being mounts (but closer to symlinks)..

Junctions are mount points that are similar to Unix bind mounts where
it counts -- in the behavior that's implemented for them in the
kernel. This behavior isn't exclusive to just volume mount points.
It's implemented the same for all junctions, and it's distinctly
different from symlinks.

There are times that I want to handle non-root mount points as if
they're symlinks, such as deleting them in rmtree(). There are times
where I want to handle them distinctly from symlinks, such as adding
code in copytree() to copy a junction.

> I guess either way the closeness of the concepts is a different story than
> the specific ask here. In other words: for clarity, adding a specific
> method makes the most sense to me.

Adding a posixpath.isjunction() function that's always false seems a
waste compared to common support for os.path.ismount(). On the other
hand, the realpath() call in posixpath.ismount() is expensive, so
calling os.path.ismount() to decide how to handle a directory would be
expensive on POSIX.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/KC5UNZRMTL6AUYOLJG7A4VV2LIJAVN6V/


[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)

2022-11-07 Thread Eryk Sun
On 11/7/22, Charles Machalow  wrote:
> So would you be for specific methods to check if a given path is a
> junction?

I'd prefer for ismount() to be modified to always return true for a
junction. This would be a significant rewrite of the current
implementation, which is only true for a junction that targets a
system volume mount point (i.e. "\\?\Volume{GUID}\"). Of course
ismount() wouldn't be true for only junctions. It would also be true for
the root path of any drive, device, or UNC share if it's an existing
filesystem directory.

Implementing a function that checks for only a junction is simple
enough. For example:

import os
import stat

def isjunction(path):
    """Test whether a path is a junction.
    """
    try:
        st = os.lstat(path)
    except (OSError, ValueError, AttributeError):
        return False
    # Compare for equality; a bitwise AND would also match other tags.
    return st.st_reparse_tag == stat.IO_REPARSE_TAG_MOUNT_POINT

To be completely certain, sometimes st_file_attributes is also checked
for stat.FILE_ATTRIBUTE_REPARSE_POINT. But a filesystem that sets a
reparse point on a directory without also setting the latter file
attribute would be dysfunctional.
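
For the "completely certain" variant, a minimal sketch might look like the following. The name isjunction_strict is hypothetical, and the getattr() fallbacks are an assumption added so that the snippet also runs where the stat result lacks the Windows-only fields (in which case it simply returns False).

```python
import os
import stat

# Fall back to the literal SDK values where the stat module lacks a constant.
IO_REPARSE_TAG_MOUNT_POINT = getattr(stat, "IO_REPARSE_TAG_MOUNT_POINT", 0xA0000003)
FILE_ATTRIBUTE_REPARSE_POINT = getattr(stat, "FILE_ATTRIBUTE_REPARSE_POINT", 0x400)

def isjunction_strict(path):
    """Test for a junction, also requiring the reparse-point file attribute."""
    try:
        st = os.lstat(path)
    except (OSError, ValueError):
        return False
    # These fields exist only on Windows; treat them as 0 elsewhere.
    attrs = getattr(st, "st_file_attributes", 0)
    tag = getattr(st, "st_reparse_tag", 0)
    return bool(attrs & FILE_ATTRIBUTE_REPARSE_POINT) and tag == IO_REPARSE_TAG_MOUNT_POINT

missing = isjunction_strict(os.path.join(os.getcwd(), "no-such-entry-for-this-example"))
print(missing)
```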
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/SE6ZHNRQ44D72ZPVCGTLNFSKVX5SAGXP/


[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)

2022-11-07 Thread Eryk Sun
On 11/7/22, Charles Machalow  wrote:
>
> Junctions are contextually similar to symlinks on Windows.

Junctions (i.e. IO_REPARSE_TAG_MOUNT_POINT) are implemented to behave
as mount points for local volumes, so there are a couple of important
differences.

In a remote path, a junction gets resolved on the server side, which
is always possible because the target of a junction must be a local
volume (i.e. local to the server). Thus a junction that targets
"C:\spam" resolves to the "C:" drive on the remote system. If you're
resolving a junction manually via `os.readlink()`, take care to never
resolve a remote junction target as a local path such as "C:\spam".
That would not only be wrong but also potentially harmful if client
files get mistakenly modified, replaced, or deleted. On the other
hand, a remote symlink that targets "C:\spam" gets resolved by the
client and thus always resolves to the local "C:" drive of the client.
This depends on the client system allowing remote-to-local (R2L)
symlinks, which is disabled by default for good reason. When resolving
a symlink manually, at worst you'll be in violation of the system's
L2L, L2R, R2L, or R2R symlink policy.

Secondly, the target of a junction does not replace the previously
traversed path when the system parses a path. This affects how a
relative symlink gets resolved, in which case traversed junctions
behave like Unix bind mount points. Say that "E:\eggs\spamlink" is a
relative symlink that targets "..\spam". When accessed directly, this
symbolic link resolves to "E:\spam".  Say that "C:\mount\junction"
targets "E:\eggs". Then "C:\mount\junction\spamlink" resolves to
"C:\mount\spam", a different file in this case. In contrast, the
target of a symlink always replaces the traversed path when the system
parses a path. Say that "C:\mount\symlink" targets "E:\eggs". Then
"C:\mount\symlink\spamlink" resolves to "E:\spam", the same as if
"E:\eggs\spamlink" had been opened directly.

> Currently is_symlink/islink return False for junctions.

Some API contexts, libraries, and applications only support
IO_REPARSE_TAG_SYMLINK reparse points as symlinks. For general
compatibility that's the only type of reparse point that reliably
counts as a "symlink".

Also, part of the rationale for this division is that currently we
cannot copy a junction via os.readlink() and os.symlink(). If we were
to copy a junction as a symlink, in general this could change how the
target path is resolved or how the link behaves in the context of
relative symlinks.
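
The relative-symlink difference described earlier can be modeled at the string level with ntpath, which runs on any platform. This is only a sketch of the rule, not of the kernel's parser: the helper name is hypothetical, and the caller supplies whichever directory path the system would be holding when it reaches the relative link.

```python
import ntpath

def resolve_relative_link(parsed_dir, link_target):
    """Join a relative link target to the directory path parsed so far."""
    return ntpath.normpath(ntpath.join(parsed_dir, link_target))

# Reached through the junction: the traversed path is kept.
via_junction = resolve_relative_link(r"C:\mount\junction", r"..\spam")

# Reached through the directory symlink: the target replaced the traversed path.
via_symlink = resolve_relative_link(r"E:\eggs", r"..\spam")

print(via_junction)
print(via_symlink)
```

Copying the junction as a symlink would silently switch "spamlink" from the first result to the second.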

It would be less of an issue if os.readlink() returned an object type
that allowed duplicating any name-surrogate reparse point via
os.symlink(). Instead of calling WinAPI CreateSymbolicLinkW() in such
cases, os.symlink() would create the target file/directory and
directly set the reparse point via FSCTL_SET_REPARSE_POINT.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/YM6EUE6HSH7QJISUXH3J24C4OSAN7JLR/


[Python-ideas] Re: Add copy to pathlib

2022-10-18 Thread Eryk Sun
On 10/18/22, Todd  wrote:
>
> So I think it would make a lot of sense to include copying inside pathlib.
> I propose adding a `copy` method to `pathlib.Path` (for concrete paths).
>
> The specific call signature would be:
>
> copy(dst, *, follow_symlinks=True, recursive=True, dir_exist_ok=True)
>
> This will call `shutil.copytree` for directories if recursive is True, or
> `copy2` if recursive is False. For files it will call `copy2` always.

FYI, Barney Gale also proposed implementing copy() and copytree()
methods recently. Barney is working on a significant restructuring of
pathlib.

https://discuss.python.org/t/incrementally-move-high-level-path-operations-from-shutil-to-pathlib/19208
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/VLZ52HC6625KYESUHP6UNLUAD4FIXZC4/


[Python-ideas] Re: Add copy to pathlib

2022-10-18 Thread Eryk Sun
On 10/18/22, Todd  wrote:
>
> How is it any less of a "path operation" than moving files, reading and
> writing files, making directories, and deleting files?

Path-related operations involve creating, linking, symlinking, and
listing directories and files, and peripherally also accessing file
metadata such as size, timestamps, attributes, and permissions (i.e.
filesystem indexing and bookkeeping). Reading and writing are I/O data
operations on the contents of files.

Copying a file is a path operation in that a new file gets created in
the filesystem, but it's primarily an I/O operation, as are the
read_text(), read_bytes(), write_text() and write_bytes() methods of
Path objects. The ship sailed a long time ago. Path objects support
I/O.
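
To illustrate the point, a copy can already be spelled with the existing I/O methods of Path objects. This is a sketch of the idea only, not the proposed copy() method; the file names are arbitrary, and shutil.copy2() is shown next to it for comparison because it also copies metadata.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "spam.txt")
    src.write_text("spam and eggs", encoding="utf-8")

    # Copy the contents using only Path's own I/O methods.
    dst_io = Path(tmp, "copy_io.txt")
    dst_io.write_bytes(src.read_bytes())

    # shutil.copy2() copies the data and also tries to preserve metadata.
    dst_sh = Path(shutil.copy2(src, Path(tmp, "copy_shutil.txt")))

    same = dst_io.read_bytes() == dst_sh.read_bytes() == src.read_bytes()

print(same)
```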
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/PPKSABYX2XUDNFJTNBJWBFFBPFFJJEDP/


[Python-ideas] Re: Use 'bin' in virtual environments on Windows

2022-07-24 Thread Eryk Sun
On 7/24/22, Barry Scott  wrote:
>
>> On 21 Jul 2022, at 16:42, Christopher Barker  wrote:
>
>> However, I’m no Windows expert, but I *think* the modern Windows file
>> system(s?) support something like symlinks. It’s an under-the-hood
>> feature, but maybe it’s possible to add a symlink for bin.
>
> It has symlinks but only available if you are administrator.

Creating symlinks requires the filesystem to support NT reparse
points. That's guaranteed for the system volume, which must be NTFS,
but it's unreliable when development is spread across various
filesystems. This is the main obstacle to relying on a "Scripts" ->
"bin" link.

It's not technically correct to state that creating symlinks requires
administrator access. It requires SeCreateSymbolicLinkPrivilege, or no
privilege at all if developer mode is enabled for the system in
Windows 10+. By default this privilege is granted to just the
administrators group. However, an administrator can grant it to any
user or group. I prefer to grant it to the "Authenticated Users"
group.

If creating a directory symlink isn't allowed, and the filesystem
supports reparse points, then a junction mount point can be created
instead. In Unix terms, this is like using a bind mount instead of a
symlink. In Windows, creating a mount point doesn't require any
privilege or special access. (Registering it with the mount-point
manager requires administrator access, but that's only done for volume
mount points, as created by SetVolumeMountPointW.)
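
A rough sketch of that fallback follows. The function name link_directory is hypothetical, and shelling out to CMD's built-in "mklink /J" is an assumption made for brevity, since the os module has no public function for creating a junction. It still fails if the filesystem doesn't support reparse points.

```python
import os
import subprocess
import sys
import tempfile

def link_directory(target, link):
    """Create a directory symlink; on Windows fall back to a junction."""
    target = os.path.abspath(target)  # a junction target must be absolute
    try:
        os.symlink(target, link, target_is_directory=True)
        return "symlink"
    except OSError:
        if sys.platform != "win32":
            raise
    # Assumption: mklink is a CMD built-in, so it is run via "cmd /c".
    subprocess.run(["cmd", "/c", "mklink", "/J", link, target],
                   check=True, stdout=subprocess.DEVNULL)
    return "junction"

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "bin")
    os.mkdir(target)
    link = os.path.join(tmp, "Scripts")
    kind = link_directory(target, link)
    link_is_dir = os.path.isdir(link)

print(kind, link_is_dir)
```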
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AS7REUDGBW4QGQPXTNCMGMC4L2ZSUUXI/


[Python-ideas] Re: TextIOBase: Make tell() and seek() pythonic

2022-05-26 Thread Eryk Sun
On 5/26/22, Steven D'Aprano  wrote:
>
> If you seek() to position 4, say, the results will be unpredictable but
> probably not anything good.
>
> In other words, the tell() and seek() cookies represent file positions
> in **bytes**, even though we are reading or writing a text file.

To clarify the general context, text I/O tell() and seek() cookies
aren't necessarily just a byte offset. They can be packed integers
that include a start position, decoder flags, a number of bytes to be
fed into the decoder, whether the decode operation should be final
(EOF), and the number of decoded characters (ordinals) to skip.  For
example:

>>> open('spam.txt', 'w', encoding='utf-7').write('\u0100'*4)
4
>>> f = open('spam.txt', encoding='utf-7')
>>> f.read(2)
'ĀĀ'
>>> f.tell()
680564734871843039612185603579607777280

>>> import _pyio
>>> start_pos, dec_flags, bytes_to_feed, need_eof, chars_to_skip = (
...     _pyio.TextIOWrapper._unpack_cookie(..., f.tell()))
>>> start_pos, dec_flags, bytes_to_feed, need_eof, chars_to_skip
(0, 55834574848, 2, False, 0)
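
The packing itself can be sketched in a few lines. This is modeled on the pure-Python _pyio implementation as I understand it, with each field occupying 64 bits; the standalone function names are hypothetical, and the field values are the ones from the session above.

```python
def pack_cookie(start_pos, dec_flags=0, bytes_to_feed=0,
                need_eof=False, chars_to_skip=0):
    """Pack the five fields into one integer, 64 bits per field."""
    return (start_pos | (dec_flags << 64) | (bytes_to_feed << 128)
            | (chars_to_skip << 192) | (bool(need_eof) << 256))

def unpack_cookie(cookie):
    """Split a cookie back into its fields, lowest 64 bits first."""
    rest, start_pos = divmod(cookie, 1 << 64)
    rest, dec_flags = divmod(rest, 1 << 64)
    rest, bytes_to_feed = divmod(rest, 1 << 64)
    need_eof, chars_to_skip = divmod(rest, 1 << 64)
    return start_pos, dec_flags, bytes_to_feed, bool(need_eof), chars_to_skip

cookie = pack_cookie(0, 55834574848, 2, False, 0)
fields = unpack_cookie(cookie)
print(cookie)
print(fields)
```

A cookie is a plain byte offset only when every field other than the start position is zero.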
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/NLU47DADXVIPBJLGP4IPLPKYBWH7DN7F/


[Python-ideas] Re: TextIOBase: Make tell() and seek() pythonic

2022-05-26 Thread Eryk Sun
On 5/26/22, Christopher Barker  wrote:
> IIRC, there were two builds- 16 and 32 bit Unicode. But it wasn’t UTF16, it
> was UCS-2.

In the old implementation prior to 3.3, narrow and wide builds were
supported regardless of the size of wchar_t. For a narrow build, if
wchar_t was 32-bit, then PyUnicode_FromWideChar() would encode non-BMP
ordinals as UTF-16 surrogate pairs, and PyUnicode_AsWideChar()
implemented the reverse, from UTF-16 back to UTF-32. There were
several similar cases, such as PyUnicode_FromOrdinal().

The header called this "limited" UTF-16 support, primarily I suppose
because the length of strings and indexing failed to account for
surrogate pairs. For example:

>>> s = '\U00010000'
>>> len(s)
2
>>> s[0]
'\ud800'
>>> s[1]
'\udc00'
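
On Python 3.3 and later the same two code units can still be seen by encoding to UTF-16. The sketch below uses U+10000 as the sample character; the helper name is hypothetical.

```python
def utf16_code_units(s):
    """Return the UTF-16 code units of a string as a list of integers."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]

s = "\U00010000"
units = utf16_code_units(s)
print(len(s), [hex(u) for u in units])
```

The string now has length 1, while its UTF-16 form still needs the surrogate pair that an old narrow build exposed through len() and indexing.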

Here's a link to the old implementation:

https://github.com/python/cpython/blob/v3.2.6/Objects/unicodeobject.c
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/ATPNS7CEQUONIWDXFCQEEUUGJBOJV72L/


[Python-ideas] Re: Custom literals, a la C++

2022-04-11 Thread Eryk Sun
On 4/11/22, Chris Angelico  wrote:
>
> Which raises the question: what if the current directory no longer has
> a path name? Or is that simply not possible on Windows?

The process working directory is opened without FILE_SHARE_DELETE
sharing. This prevents opening the directory with DELETE access from
any security context in user mode, even by the SYSTEM account.

If the handle for the working directory is forcefully closed (e.g. via
Process Explorer) and the directory is deleted, then accessing a
relative path in the affected process fails with ERROR_INVALID_HANDLE
(6) until the working directory is changed to a valid directory.

> (Don't even get me started on prefixing paths with \\?\ and what that
> changes. Windows has bizarre backward compatibility constraints.)

Paths prefixed by \\?\ or \\.\ are not supported for the process
working directory and should not be used in this case. The Windows API
is buggy if the working directory is set to a prefixed path. For
example, it fails to identify a drive such as r"\\?\C:" or
r"\\?\UNC\server\share" in the working directory, in which case a
rooted path such as r"\spam" can't be accessed.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/5A4RYPI6T7FHGRP7KOEL2ISQHHNUPLCJ/


[Python-ideas] Re: Custom literals, a la C++

2022-04-11 Thread Eryk Sun
On 4/11/22, Chris Angelico  wrote:
>
> If you say `open("/spam")`, Windows uses "default drive" + "explicit
> directory".

You can think of a default drive as being the drive of the current
working directory, but there is no "default drive" per se that's
stored separate from the working directory.

Python and most other filesystem libraries generalize a UNC
"\\server\share" path as a 'drive', in addition to drive-letter drives
such as "Z:". However, the working directory is only remembered
separately from the process working directory in the case of
drive-letter drives, not UNC shares.

If the working directory is r"\\server\share\foo\bar", then r"\spam"
resolves to r"\\server\share\spam".

If the working directory is r"\\server\share\foo\bar", then "spam"
resolves to r"\\server\share\foo\bar\spam". However, the system will
actually access this path relative to an open handle for the working
directory.
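
The two UNC cases can be reproduced at the string level with ntpath.join(), which runs on any platform. This is only a model of the result; as noted, the system itself resolves the relative case against an open directory handle.

```python
import ntpath

cwd = r"\\server\share\foo\bar"

# A rooted path keeps the "drive" (here the UNC share) of the working directory.
rooted = ntpath.join(cwd, r"\spam")

# A plain relative path is appended to the working directory.
relative = ntpath.join(cwd, "spam")

print(ntpath.splitdrive(cwd))
print(rooted)
print(relative)
```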

A handle for the process working directory is always kept open and
thus protected from being renamed or deleted. Per-drive working
directories are not kept open. They're just stored as path names in
reserved environment variables.

> Hence there are 26 current directories (one per drive), plus the
> selection of current drive, which effectively chooses your current
> directory.

If the process working directory is a DOS drive path, then 26 working
directories are possible. If the process working directory is a UNC
path, then 27 working directories are possible.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/OR65GYLNYOV4LT3ZEM3YFIVHSOP3D664/


[Python-ideas] Re: Custom literals, a la C++

2022-04-11 Thread Eryk Sun
On 4/11/22, Steven D'Aprano  wrote:
>
> How does that work in practice? In Windows, if you just say the
> equivalent to `open('spam')`, how does the OS know which drive
> and WD to use?

"spam" is resolved against the process working directory, which could
be a UNC path instead of a drive. OTOH, "Z:spam" is relative to the
working directory on drive "Z:". If the latter is r"Z:\foo\bar", then
"Z:spam" resolves to r"Z:\foo\bar\spam".

The working directory on a drive gets set via os.chdir() when the
process working directory is set to a path on the drive. It's
implemented via reserved environment variables with names that begin
with "=", such as "=Z:" set to r"Z:\foo\bar". Python's os.environ
doesn't support getting or setting these variables, but WinAPI
GetEnvironmentVariableW() and SetEnvironmentVariableW() do.
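
A string-level sketch of drive-relative resolution, using ntpath.join() so that it runs anywhere. Here the per-drive working directory is passed in by hand; on Windows the system reads it from the "=Z:" variable instead. The helper name is hypothetical.

```python
import ntpath

def resolve_drive_relative(drive_cwd, path):
    """Resolve a drive-relative path against that drive's working directory."""
    return ntpath.join(drive_cwd, path)

# "Z:spam" against the working directory remembered for drive Z:.
same_drive = resolve_drive_relative(r"Z:\foo\bar", "Z:spam")

# ntpath has no access to per-drive state, so joining against a directory
# on another drive just yields the drive-relative path unchanged.
other_drive = ntpath.join(r"C:\work", "Z:spam")

print(same_drive)
print(other_drive)
```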
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/ULS4MZZNF6MIUEGGRF5GIJ2PSJJOUGYL/


[Python-ideas] Re: Custom literals, a la C++

2022-04-11 Thread Eryk Sun
On 4/11/22, Steven D'Aprano  wrote:
>
> You know how every OS process has its own working directory? Just like
> that, except every module.

A per-thread working directory makes more sense to me. But it would be
a lot of work to implement support for this in the os and io modules,
for very little gain.

> "One WD per process" is baked so deep into file I/O on Posix
> systems (and I presume Windows) that it's probably impossible to
> implement in current systems.

Windows has up to 27 working directories per process. There's the
overall working directory, plus one for each drive.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/IJFHA3HTOHEANOXD34KSK7TYDHZYULWA/


[Python-ideas] Re: Missing expandvars equivalent in pathlib

2022-02-13 Thread Eryk Sun
On 2/13/22, Eric Fahlgren  wrote:
>
> That may or may not work as Windows has inconsistent treatment of multiple
> separators depending on where they appear in a path.  If TEMP is a drive
> spec, say "t:\", then it expands to "t:\\spam.csv", which is an invalid
> windows path.  If TEMP is a directory spec, "c:\temp\", then it expands to
> "c:\temp\\spam.csv", which works fine.
>
> C:\> dir c:\\temp\junk
> The filename, directory name, or volume label syntax is incorrect.

"c:\\temp\junk" isn't always invalid in CMD, and definitely not in the
Windows API. The problem occurs because the DIR command in the CMD
shell has legacy support to ignore the drive (e.g. "C:") when the root
of the path is exactly two backslashes. This mirrors how DOS behaved
in the 1980s (i.e. they went out of their way to add this behavior in
CMD to make it compatible with DOS).

To see this, check the "C$" administrative share on "localhost":

C:\>dir /b C:\\localhost\C$\Temp\spam.txt
File Not Found
C:\>echo spam >C:\\Temp\spam.txt
C:\>dir /b C:\\localhost\C$\Temp\spam.txt
spam.txt

Even though using two backslashes for the root of a drive path is
allowed in the Windows API itself, it's still problematic. The path
part r"\\path\to\file" can't be used as relative to the current drive
of the process because it's always a UNC absolute path. So it should
be normalized to r"\path\to\file" as soon as possible, e.g. via
GetFullPathNameW():

>>> print(nt._getfullpathname(r'C:\\path\to\file'))
C:\path\to\file
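
Where the private nt module isn't available, the same cleanup can be approximated with ntpath.normpath(). This is a string-only sketch that runs on any platform; it is a stand-in, not a call to GetFullPathNameW().

```python
import ntpath

# With a drive letter, the doubled root backslash collapses to one.
normalized = ntpath.normpath(r"C:\\path\to\file")

# Without the drive letter, the same text parses as a UNC path whose
# "drive" is the server and share.
unc_drive, unc_rest = ntpath.splitdrive(r"\\path\to\file")

print(normalized)
print(unc_drive, unc_rest)
```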
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/NAELNA54K5RDB23CM4MVGXRN7PBPNVYT/


[Python-ideas] Re: Missing expandvars equivalent in pathlib

2022-02-13 Thread Eryk Sun
On 2/13/22, Paul Moore  wrote:
>
> For better or worse, though, Windows (as an OS) doesn't have a "normal
> behaviour". %-expansion is a feature of CMD and .bat files, which

You're overlooking ExpandEnvironmentStringsW() [1],
ExpandEnvironmentStringsForUserW(), and PathUnExpandEnvStringsW() [2],
which provide basic support for `%` based environment variables in
strings. Python's standard library supports
winreg.ExpandEnvironmentStrings().

It is critical that the system supports this functionality in order to
evaluate REG_EXPAND_SZ values in the registry.
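
A minimal sketch of using this from Python. The wrapper name and the SPAM_DIR variable are hypothetical, and falling back to ntpath.expandvars() where winreg can't be imported is an assumption made so that the snippet runs on other platforms too.

```python
import ntpath
import os

try:
    import winreg  # Windows only
except ImportError:
    winreg = None

def expand_percent_vars(text):
    """Expand %NAME% references, preferring the system's own expansion."""
    if winreg is not None:
        return winreg.ExpandEnvironmentStrings(text)
    return ntpath.expandvars(text)

os.environ["SPAM_DIR"] = r"C:\Temp"  # illustrative variable
expanded = expand_percent_vars(r"%SPAM_DIR%\spam.csv")
print(expanded)
```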

[1] 
https://docs.microsoft.com/en-us/windows/win32/api/processenv/nf-processenv-expandenvironmentstringsw
[2] 
https://docs.microsoft.com/en-us/windows/win32/api/shlwapi/nf-shlwapi-pathunexpandenvstringsw
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/SNZVO3OAF5CZFALNQN6XIQRCJVN2NZ75/


[Python-ideas] Re: Please consider mentioning property without setter when an attribute can't be set

2022-02-13 Thread Eryk Sun
On 2/13/22, Christopher Barker  wrote:
>
> Telling newbies that that means that it's either a property with no setter,
> or an object without a __dict__, or one with  __slots__ defined is not
> really very helpful.

The __slots__ case is due to the lack of a __dict__ slot.  It can be
manually added in __slots__ (though adding __dict__ back is uncommon),
along with the __weakref__ slot.

The exception message when there's no __dict__ is generally good
enough. For example:

>>> (1).x = None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'x'

It's clear that the object has no __dict__ and no descriptor named
"x". However, the message gets confusing with partially implemented
magic attributes.

For example, implement __getattr__(), but not __setattr__() or __delattr__():

class C:
    __slots__ = ()
    def __getattr__(self, name):
        class_name = self.__class__.__name__
        if name == 'x':
            return 42
        raise AttributeError(f'{class_name!r} object has no '
                             f'attribute {name!r}')

>>> c = C()
>>> c.x
42
>>> c.x = None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'C' object has no attribute 'x'

Add __setattr__():

    def __setattr__(self, name, value):
        class_name = self.__class__.__name__
        if name == 'x':
            raise AttributeError(f'attribute {name!r} of {class_name!r} '
                                 'objects is not writable')
        raise AttributeError(f'{class_name!r} object has no '
                             f'attribute {name!r}')

>>> c = C()
>>> c.x = None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in __setattr__
AttributeError: attribute 'x' of 'C' objects is not writable
>>> del c.x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'C' object has no attribute 'x'

Add __delattr__():

    def __delattr__(self, name):
        class_name = self.__class__.__name__
        if name == 'x':
            raise AttributeError(f'attribute {name!r} of {class_name!r} '
                                 'objects is not writable')
        raise AttributeError(f'{class_name!r} object has no '
                             f'attribute {name!r}')

>>> c = C()
>>> c.x
42
>>> c.x = None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in __setattr__
AttributeError: attribute 'x' of 'C' objects is not writable
>>> del c.x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 19, in __delattr__
AttributeError: attribute 'x' of 'C' objects is not writable
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/3S2KW3O7O7KKBQD2FVW6NG3CISNHF745/


[Python-ideas] Re: Please consider mentioning property without setter when an attribute can't be set

2022-02-11 Thread Eryk Sun
On 2/11/22, Paul Moore  wrote:
>
> I'm inclined to say just raise an issue on bpo. If it's easy enough,
> it'll just get done. If it's hard, having lots of people support the
> idea won't make it any easier. I don't think this is something that
> particularly needs evidence of community support before asking for it.

The error message is in property_descr_set() in Objects/descrobject.c.
I agree that it should state that the attribute is a property. Python
developers know that a property requires a getter, setter, and deleter
method in order to function like a regular, mutable attribute. If not,
help(property) explains it all clearly.
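
For reference, the situation under discussion looks like this. It is a minimal sketch with a made-up class; the exact wording of the error message differs between Python versions, so it is captured rather than quoted.

```python
class Point:
    def __init__(self, x):
        self._x = x

    @property
    def x(self):
        # Getter only: no setter or deleter is defined.
        return self._x

p = Point(1)
try:
    p.x = 2
except AttributeError as exc:
    message = str(exc)
else:
    message = None

print(message)
```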
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/WBQZMFZT4KCXKEUEND4BNAZFAAUA7HA2/


[Python-ideas] Re: os.workdir() context manager

2021-09-15 Thread Eryk Sun
On 9/15/21, Paul Moore  wrote:
>
> Just a somewhat off-topic note, but dir_fd arguments are only
> supported on Unix, and the functionality only appears to be present at
> the NT Kernel level on Windows, not in the Windows API.

Handle-relative paths are supported by all NT system calls that access
object paths, but NT doesn't support ".." components. Normal user-mode
programs can make system calls directly (e.g. call NtCreateFile
instead of CreateFile), but even if Python bypassed the Windows API to
support dir_fd, the lack of support for ".." components in relative
paths would be an annoying inconsistency with POSIX dir_fd support.
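
For comparison, here is a rough sketch of the POSIX side, where a
path that's relative to a directory descriptor may contain ".."
components. The helper names read_relative() and demo() are made up
for illustration, and the sketch raises NotImplementedError on
platforms where os.open() doesn't support dir_fd.

```python
import os
import tempfile


def read_relative(dir_path, rel_path):
    """Read a file by a path relative to an open directory descriptor."""
    if os.open not in os.supports_dir_fd:
        raise NotImplementedError('dir_fd is not supported on this platform')
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        fd = os.open(rel_path, os.O_RDONLY, dir_fd=dir_fd)
        with open(fd, 'rb') as f:  # closes fd when the block exits
            return f.read()
    finally:
        os.close(dir_fd)


def demo():
    """Create top/spam.txt and read it relative to the top/sub directory."""
    with tempfile.TemporaryDirectory() as top:
        sub = os.path.join(top, 'sub')
        os.mkdir(sub)
        with open(os.path.join(top, 'spam.txt'), 'wb') as f:
            f.write(b'spam')
        # The ".." component is resolved relative to the 'sub' descriptor.
        return read_relative(sub, os.path.join('..', 'spam.txt'))
```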
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QO253Y4XOMJZC7YQQRZPYME353M7WDDA/


[Python-ideas] Re: Integer concatenation to byte string

2021-03-02 Thread Eryk Sun
On 3/1/21, mmax42...@gmail.com  wrote:

> And there is no way to make a mutable bytes object without a function call.

Since a code object is immutable, the proposed bytearray display form
would still require an internal operation that constructs a bytearray
from a bytes object. For example, something like the following:

BUILD_BYTEARRAY  0
LOAD_CONST   0 (b'spam')
BYTEARRAY_EXTEND 1

> I propose an array-type string like the, or for the bytearray. It would work
> as a mutable b-string, as
>
> foo = a"\x00\x01\x02abcÿ"   # a-string, a mutable bytes object.
> foo[0] = 123  # Item assignment
> foo+= 255 # Works the same as

Concatenating a sequence with a number shouldn't be allowed. OTOH, I
think `foo += [255]` should be supported as foo.extend([255]), but
bytearray doesn't allow it currently. `foo.append(255)` is supported.
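
A short sketch of the current behavior described above, as I
understand it in CPython: append() and extend() accept integers, while
in-place concatenation only accepts bytes-like objects.

```python
foo = bytearray(b'spam')

foo.append(255)        # a single integer item
foo.extend([0, 1])     # an iterable of integers
foo += b'\x02'         # in-place concatenation with a bytes-like object

try:
    foo += [255]       # a list isn't a bytes-like object
except TypeError:
    list_concat_rejected = True
else:
    list_concat_rejected = False
```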
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BRYZHZAVNCZYYOVSKOKLZMATASMU4WH6/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-11 Thread Eryk Sun
On 2/11/21, M.-A. Lemburg  wrote:
> On 11.02.2021 13:49, Eryk Sun wrote:
>
>> Currently, locale.getpreferredencoding(False) is implemented as
>> locale._get_locale_encoding(). This ultimately calls
>> _Py_GetLocaleEncoding(), defined in "Python/fileutils.c".
>> TextIOWrapper() calls this C function to get the encoding to use when
>> encoding=None is passed.
>
> All that seems to be new in Python 3.10. This is not what's
> happening in Python 3.9. The _get_locale_encoding() function
> doesn't even exist.

In previous versions, locale.getpreferredencoding(False) is
functionally the same. The only difference in 3.10 is that it's
implemented in C via locale._get_locale_encoding().

> Why an env variable ? You could simply open up a ticket to get this
> fixed, since 3.10 is not released yet.

I thought it would be best to let users/administrators opt in to POSIX
behavior. But maybe it should require opting out.

>>>> getlocale(LC_CTYPE)
> ('en_US', 'ISO8859-1')
>>>> getlocale(LC_CTYPE)
> ('el_GR', 'ISO8859-7')

Windows code pages 1252 and 1253 are not the same as ISO-8859-1 and
ISO-8859-7. getlocale() is just looking up the encoding of "en_US" and
"el_GR" from the mapping in the locale module. That kind of best-guess
result isn't right for locale._get_locale_encoding().

> The returned values for the encoding look mostly correct to
> me, except the one for the 'C' locale which should be 'ascii'.

The "C" locale in the Windows CRT uses Latin-1 for LC_CTYPE. This is
implemented for mbstowcs() by casting from char to wchar_t. It's
similar for wcstombs(), and limited to Unicode ordinals below 256.
However, the "C" locale isn't consistently Latin-1 across other
categories. IIRC, LC_TIME in the "C" locale uses the process ANSI code
page for time-zone names, and mojibake is common.

> Anyway, UTF-8 mode is the way to go these days, esp. if you want
> to write applications which are portable across platforms and
> behave the same on all.

Globally setting PYTHONUTF8 forces all scripts to use UTF-8 as the
default for open(). I'd like to let scripts opt in to using UTF-8 as
the default for open() by way of an explicit setlocale() call such as
setlocale(LC_CTYPE, (getdefaultlocale()[0], "UTF-8")) or, Windows
only, setlocale(LC_CTYPE, ".UTF-8"). In POSIX, Python already tries
coercing the "C" and "POSIX" locales (usually ASCII) to use UTF-8.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/A6HOUXS4E2LFCSZA4RTJ3OE6ZXHRVAQF/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-11 Thread Eryk Sun
On 2/11/21, M.-A. Lemburg  wrote:

> I think the main problem here is that open() doesn't use
> locale.getlocale()[1] as default for the encoding parameter,
> but instead locale.getpreferredencoding(False).

Currently, locale.getpreferredencoding(False) is implemented as
locale._get_locale_encoding(). This ultimately calls
_Py_GetLocaleEncoding(), defined in "Python/fileutils.c".
TextIOWrapper() calls this C function to get the encoding to use when
encoding=None is passed.

In POSIX, _Py_GetLocaleEncoding() calls nl_langinfo(CODESET), which
returns the current LC_CTYPE encoding, not the default LC_CTYPE
encoding. For example, in Linux:

>>> setlocale(LC_CTYPE, 'en_US.UTF-8')
'en_US.UTF-8'
>>> _get_locale_encoding()
'UTF-8'
>>> open('test.txt').encoding
'UTF-8'

>>> setlocale(LC_CTYPE, 'en_US.ISO-8859-1')
'en_US.ISO-8859-1'
>>> _get_locale_encoding()
'ISO-8859-1'
>>> open('test.txt').encoding
'ISO-8859-1'

In Windows, _Py_GetLocaleEncoding() just uses GetACP(), which returns
the process ANSI code page. This is based on the CRT's default locale
set by setlocale(LC_CTYPE, ""), which combines the user's default
locale with the process ANSI code page. I'm not overjoyed about this
combination in the default locale, since it's potentially inconsistent
(e.g. Korean user locale with Latin 1252 process code page), but that
ship sailed a long time ago. I'm not arguing to change
locale.getdefaultlocale().

The problem is that locale._get_locale_encoding() in Windows is not
returning the current LC_CTYPE locale encoding, in contrast to how it
behaves in POSIX. I'd like an environment variable and/or -X option to
fix this flaw. If enabled, and if the C runtime supports UTF-8 locales
(as it has for the past 3 years in Windows 10), and the application
warrants it (e.g. many open calls across many modules), then
convenient use of UTF-8 would be one setlocale() call away.

It's not for packages. Frankly, I don't see why it's a problem for a
package developer to use encoding='utf-8' for files that need to use
UTF-8. Developing libraries that are designed to work in arbitrary
applications on multiple platforms is tedious work. Having to
explicitly pass encoding='utf-8' goes with the territory, and it's a
minor annoyance in the grand scheme of things.

> That's what getlocale(LC_CTYPE) is intended for, unless I'm
> missing something.

getlocale() can't be relied on to parse the correct codeset from the
locale name, and it can even raise ValueError (more likely in Windows,
e.g. with the native locale name "en-US"). The codeset should be
queried directly using an API call, such as nl_langinfo(CODESET) in
POSIX.
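
As a minimal sketch of querying the codeset directly, the following
uses locale.nl_langinfo() where it exists. The function name
current_ctype_codeset() is made up for illustration, and the fallback
branch is only an assumption for platforms without nl_langinfo(). It
reports Python's default encoding, which is not necessarily that of
the current locale.

```python
import locale


def current_ctype_codeset():
    """Return the codeset name of the current LC_CTYPE locale, if possible."""
    if hasattr(locale, 'nl_langinfo'):
        # POSIX: ask the C library directly instead of parsing a locale name.
        return locale.nl_langinfo(locale.CODESET)
    # Assumed fallback where nl_langinfo() is unavailable (e.g. Windows);
    # this reports the default encoding, not the current locale's.
    return locale.getpreferredencoding(False)
```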

In Windows, the C runtime's POSIX locale implementation doesn't
include nl_langinfo(). There's ___lc_codepage_func(), but it's
documented as an internal function. A ucrt locale record, however,
does expose the code page as a public field, as documented in the
public header "corecrt.h". Here's a prototype using ctypes:

import os
import ctypes

ucrt = ctypes.CDLL('ucrtbase', use_errno=True)

# Public prefix of the ucrt locale record, per "corecrt.h".
class _crt_locale_data_public(ctypes.Structure):
    _fields_ = (('_locale_pctype', ctypes.POINTER(ctypes.c_ushort)),
                ('_locale_mb_cur_max', ctypes.c_int),
                ('_locale_lc_codepage', ctypes.c_uint))

class _crt_locale_pointers(ctypes.Structure):
    _fields_ = (('locinfo', ctypes.POINTER(_crt_locale_data_public)),
                ('mbcinfo', ctypes.c_void_p))

ucrt._get_current_locale.restype = ctypes.POINTER(_crt_locale_pointers)
ucrt._free_locale.restype = None
ucrt._free_locale.argtypes = (ctypes.POINTER(_crt_locale_pointers),)

CP_UTF8 = 65001

def _get_locale_encoding():
    locale = ucrt._get_current_locale()
    if not locale:
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno))
    try:
        codepage = locale[0].locinfo[0]._locale_lc_codepage
    finally:
        ucrt._free_locale(locale)
    if codepage == 0:
        return 'latin-1' # "C" locale
    if codepage == CP_UTF8:
        return 'utf-8'
    return f'cp{codepage}'

Examples with Python 3.9 in Windows 10:

>>> setlocale(LC_CTYPE, 'C')
'C'
>>> _get_locale_encoding()
'latin-1'
>>> setlocale(LC_CTYPE, 'en_US')
'en_US'
>>> _get_locale_encoding()
'cp1252'
>>> setlocale(LC_CTYPE, 'el_GR')
'el_GR'
>>> _get_locale_encoding()
'cp1253'
>>> setlocale(LC_CTYPE, 'en_US.utf-8')
'en_US.utf-8'
>>> _get_locale_encoding()
'utf-8'
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OQJBNUKMKH6CHGJKFM6H6SCOEIYECLSU/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-10 Thread Eryk Sun
On 2/11/21, Christopher Barker  wrote:
> On Wed, Feb 10, 2021 at 12:33 AM Paul Moore  wrote:
>
>> So get PYTHONUTF8 added to the environment activate script. That's a
>> simple change to venv. And virtualenv, and conda
>
> That's probably a good solution for venv and virtualenv -- essentially add
> it as another environment creation option.

Note that using a virtual environment does not require activation. A
script can be deployed to run in a virtual environment by referring to
the environment's executable in a shebang line, e.g.:

#!path\to\venv\Scripts\python.exe

Or with a Windows shell link that runs

path\to\venv\Scripts\python.exe path\to\script.py

Setting PYTHONUTF8 in the activate script does nothing to educate
users about the default encoding in other contexts. The REPL shell
could print a short message at startup that informs the user that
Python is using UTF-8 mode, including a link to a web page that
explains this in more detail.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SN4HJZFL3CXOJS53DUTQDRQ4MCXRLERT/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-10 Thread Eryk Sun
On 2/10/21, M.-A. Lemburg  wrote:
>
> setx PYTHONUTF8 1
>
> does the trick in an admin command shell on Windows globally.

The above command sets the variable only for the current user, which
I'd recommend anyway. It does not require administrator access. To set
a machine value, run `setx /M PYTHONUTF8 1`, which of course requires
administrator access. Also, run `set PYTHONUTF8=1` in CMD or
`$env:PYTHONUTF8=1` in PowerShell to set the variable in the current
shell.

Unrelated to UTF-8 mode and long-term plans to make UTF-8 the
preferred encoding, what I want, from the perspective of writing
applications and scripts (not libraries), is a -X option and/or
environment variable to make locale._get_locale_encoding() behave like
it does in POSIX. It should return the LC_CTYPE codeset of the current
locale, not just the default locale. This would allow setlocale() in
Windows to change the default for encoding=None, just as it does in
POSIX. Technically it's not hard to implement in a way that's as
reliable as nl_langinfo(CODESET) in POSIX. The code page of the
current CRT locale is a public field. In Windows 10 the CRT has
supported UTF-8 for 3 years -- regardless of the process active code
page returned by GetACP(). Just call setlocale(LC_CTYPE, ".UTF-8") or
setlocale(LC_CTYPE, (getdefaultlocale()[0], 'UTF-8')).
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KCCRN4T4TLOUH6GYQ3JDIPFZUUDA4QQA/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-09 Thread Eryk Sun
On 2/9/21, Inada Naoki  wrote:
> On Tue, Feb 9, 2021 at 7:42 PM M.-A. Lemburg  wrote:
>
> But it affects to all Python installs. Can teachers recommend to set
> PYTHONUTF8 environment variable for students?

Users can simply create a shortcut that targets `cmd /k set
PYTHONUTF8=1`. Optionally change the shortcut's "start in" directory
to the desired working directory.

>> Here's a good blog post about setting env vars on Windows:
>>
>> https://www.dowdandassociates.com/blog/content/
>> howto-set-an-environment-variable-in-windows-command-line-and-registry/

Command-line modification of the persistent environment is rarely
required. Using setx.exe is okay for setting simple variables in CMD
[1], such as `setx PYTHONUTF8 1`, combined with `set PYTHONUTF8=1` for
the current shell.

To do this in the GUI in Windows 10, click on the start button (or tap
the WIN key) to show the start menu; type "environ"; and click on
"Edit environment variables for your account". In the window that
opens, click the "New" button; type "PYTHONUTF8" as the name and "1"
(without quotes) as the value. Click the "OK" button on the dialog,
and then click the "OK" button on the editor window.

To test the value, assuming you have the py launcher installed, press
WIN+R to open the run dialog. Type "py", and in the Python shell
confirm that executing `import locale; locale.getpreferredencoding()`
returns 'UTF-8'.
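
The effect of the variable can also be checked from a script. This is
a rough sketch that spawns a child interpreter with PYTHONUTF8 set in
its environment and reads back sys.flags.utf8_mode. The helper name
child_utf8_mode() is made up for illustration.

```python
import os
import subprocess
import sys


def child_utf8_mode(value):
    """Run a child interpreter with PYTHONUTF8 set and report its UTF-8 mode."""
    env = dict(os.environ, PYTHONUTF8=value)
    code = 'import sys; print(sys.flags.utf8_mode)'
    result = subprocess.run([sys.executable, '-c', code], env=env,
                            capture_output=True, text=True, check=True)
    return int(result.stdout.strip())
```

Calling child_utf8_mode('1') should report that UTF-8 mode is enabled
in the child, without changing anything in the parent process or the
persistent environment.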

---
[1] I would feel remiss in discussing "setx.exe" without warning about
naively trying to modify PATH. For example, DO NOT execute a command
like `setx.exe PATH "C:\Program Files\Python39;%PATH%"`. This is wrong
because it sets the current PATH value, including the system part, as
the user "Path" value, truncated to 1024 characters, and without the
original dependence on system variables and independent (REG_SZ) user
variables. Properly modifying the persistent "Path" from CMD is
difficult and requires careful use of both reg.exe and setx.exe. It's
easier in PowerShell. It's far easier to use the GUI editor, which in
Windows 10 even provides an exploded list view that makes it simple to
add/remove directories and move them up and down in the list.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/WOTADHRBJRTMERYNVUOW4LMW3CIKHTDQ/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-06 Thread Eryk Sun
On 2/6/21, Christopher Barker  wrote:
> On Sat, Feb 6, 2021 at 11:47 AM Eryk Sun  wrote:
>
>> Relative to the installation, "python.cfg" should only be found in the
>> same directory as the base executable, not its parent directory.
>
> OK, my mistake — I thought that was already the case with pyvenv.cfg.
> Though I don’t get why it matters.

Chiefly, I don't want to overload "pyvenv.cfg" with new behavior
that's unrelated to virtual environments.

I also dislike the way this file is found. If the parent directory is
"C:\Program Files", then I'm not worried about finding "C:\Program
Files\pyvenv.cfg" when the interpreter tries to open it. But this
pattern is not safe in general when installed to an arbitrary
directory, or with a portable distribution.

The presence of a "._pth" file (Windows only) beside the DLL or
executable bypasses the search for "pyvenv.cfg", among other things.
The embedded distribution includes a ._pth that locks it down. This is
another reason to use a different file to configure defaults for -X
settings such as "utf8", a file that's guaranteed to always be read.

>> Add an option in the installed "python.cfg" to set the name of the
>> organization and application.
>
> That would work for, e.g. pyinstaller (which I hope already ignores these
> kinds of configuration).
>
> But not for, e.g. web applications that expect to use virtual environments
> to isolate themselves.

The idea to use the profile data directories %ProgramData% and
%LocalAppData% was for symmetry with how this could be supported in
POSIX, which doesn't use the application directory as Windows does.

The application "python.cfg" (in the directory of the executable,
including a virtual environment) can support a setting to isolate it
from system and user "python.cfg" files.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/T42D2VDMQ7JY7WYP2W3ALFHZGUYXLPZF/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-06 Thread Eryk Sun
On 2/6/21, Christopher Barker  wrote:
> On Fri, Feb 5, 2021 at 12:59 PM Eryk Sun  wrote:
>
> But why limit it to that? If there are more things to configure in an
> environment-specific way — why not put it in this existing location?

I'd rather not limit the capability to just virtual environments.

>> I'd prefer a new configuration file that sets the default values for
>> -X implementation-specific options. The mechanism for finding this
>> file can support virtual environments.
>
> Then wouldn’t that simply be two configuration files that will be treated
> the same way?

Relative to the installation, "python.cfg" should only be found in the
same directory as the base executable, not its parent directory. If
"pyvenv.cfg" is found, then it's a virtual environment, and
"python.cfg" will also be looked for in the directory of "pyvenv.cfg",
and supersedes settings in the base installation.
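
To make the proposed lookup order concrete, here's a small sketch.
"python.cfg" is only a proposal in this thread, and the function name
python_cfg_candidates() is made up for illustration. It returns the
candidate paths from lowest to highest precedence.

```python
from pathlib import PureWindowsPath


def python_cfg_candidates(base_exe_dir, venv_dir=None):
    """List hypothetical "python.cfg" locations, lowest precedence first."""
    candidates = [PureWindowsPath(base_exe_dir) / 'python.cfg']
    if venv_dir is not None:
        # venv_dir is the directory in which "pyvenv.cfg" was found.
        candidates.append(PureWindowsPath(venv_dir) / 'python.cfg')
    return candidates
```

Note that the parent directory of the base executable is deliberately
never a candidate.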

> I’m still convinced that It is a bad idea to have User-wide Python
> configuration like this. The fact is that different Python apps (may) need
> different configurations, and environments are the way to support that.

Add an option in the installed "python.cfg" to set the name of the
organization and application. If not set, the organization and
application respectively default to "Python" and
"Python[-32]". Looking for system and user configuration
would be parameterized using that name, i.e.
"%ProgramData%\<organization>\<application>\python.cfg" and
"%LocalAppData%\<organization>\<application>\python.cfg".
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TXKCDQL3JNCUG52M265LU5O7USBWO7D6/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-06 Thread Eryk Sun
On 2/6/21, Inada Naoki  wrote:
>
> If adding option to pyvenv.cfg is not make sense, we can add
> `python.ini` to same place pyvenv.cfg. i.e., directory containing
> python.exe, or one above directory.

I'd rather look for "python.cfg" in the directory of the base
executable (e.g. "C:\Program Files\Python310") and then in the
directory of "pyvenv.cfg", if the latter is found. I wouldn't want it
to check for "python.cfg" in the parent directory of the base
executable.

> And no need for all installations, and per-user setting.
> Environment variable is that already.

A configuration file in a profile data directory can target a
particular version, such as
"%LocalAppData%\Python\Python310-32\python.cfg". This is more flexible
for the user to override a system installation, compared to setting
PYTHONUTF8. However, it's not a major issue if you don't want to
support the extra flexibility.

That said, supporting %ProgramData% and %LocalAppData% data
directories is more consistent with how this feature would be
implemented in POSIX, such as "/etc/python3.10/python.cfg" and
"$HOME/.config/python310/python.cfg". I think that matters because
this file would be a good place to set defaults for all -X options
(e.g. "utf8", "pycache_prefix", "faulthandler").
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3QGOQTRJTHQPQ5MQ2URCKKYBKASMAEH2/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-05 Thread Eryk Sun
On 2/5/21, Barry Scott  wrote:
>> On 5 Feb 2021, at 11:06, Inada Naoki  wrote:
>
>> python.exe lookup pyvenv.cfg even outside of venv.
>> So we can write utf8mode=1 in pyvenv.cfg even outside of venv.

I don't like extending "pyvenv.cfg" with generic settings. This is a
file to configure a virtual environment in terms of finding the
standard library and packages.

I'd prefer a new configuration file that sets the default values for
-X implementation-specific options. The mechanism for finding this
file can support virtual environments.

> This is the problem that I was thinking about when I proposed using
> a py.ini like solution where the file is looked for in the users config
> folder. I think that is the %LOCALAPPDATA% folder for py.exe.

It is standard practice and recommended to create a directory for the
organization or project and optionally a child directory for each
application, such as "%ProgramData%\Python\Python38-32\python.ini" and
"%LocalAppData%\Python\Python38-32\python.ini".

I would have preferred for the py launcher to read and merge settings
for all existing configuration files in the order of
"%ProgramData%\Python\py.ini" (all installations),
"%__AppDir__%\py.ini" (particular installation), and
"%LocalAppData%\Python\py.ini" (user).
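
The kind of merge I have in mind can be sketched in Python with
configparser, which reads a list of files in order and lets later
files override earlier ones. This is not how the launcher is
implemented, and merge_ini_files() is a made-up name for illustration.

```python
import configparser


def merge_ini_files(paths):
    """Merge INI files in order; later files override earlier ones."""
    parser = configparser.ConfigParser()
    # read() skips files that don't exist and returns those it parsed.
    parser.read(paths, encoding='utf-8')
    return parser
```

Passing the paths in the order all installations, particular
installation, user gives the user's file the final say for any key
that it sets, while keys that it doesn't set fall through to the
earlier files.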
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2W6V2WURBTGEXOE7CH4B73IMMGUNHY3W/


[Python-ideas] Re: Add a couple of options to open()'s mode parameter to deal with common text encodings

2021-02-04 Thread Eryk Sun
On 2/4/21, Ben Rudiak-Gould  wrote:
>
> My proposal is to add a couple of single-character options to open()'s mode
> parameter. 'b' and 't' already exist, and the encoding parameter
> essentially selects subcategories of 't', but it's annoyingly verbose and
> so people often omit it.
>
> If '8' was equivalent to specifying encoding='UTF-8', and 'L' was
> equivalent to specifying encoding=(the real locale encoding, ignoring UTF-8
> mode), that would go a long way toward making open more convenient in the
> common cases on Windows, and I bet it would encourage at least some of
> those developing on Unixy platforms to write more portable code also.

A precedent for using the mode parameter is [_w]fopen in MSVC, which
supports a "ccs=<encoding>" flag, where "<encoding>" can be "UTF-8",
"UTF-16LE", or "UNICODE".

---

In terms of using the 'locale', keep in mind that the implementation
in Windows doesn't use the current LC_CTYPE locale. It only uses the
default locale, which in turn uses the process active (ANSI) code
page. The latter is a system setting, unless overridden to UTF-8 in
the application manifest (e.g. the manifest that's embedded in
"python.exe").

I'd like to see support for a -X option and/or environment variable to
make Python in Windows actually use the current locale to get the
locale encoding (a real shocker, I know). For example,
setlocale(LC_CTYPE, "el_GR") would select "cp1253" (Greek) as the
locale encoding, while setlocale(LC_CTYPE, "el_GR.utf-8") would select
"utf-8" as the locale encoding.

(The CRT supports UTF-8 in locales starting with Windows 10, build
17134, released on 2018-04-03.)

At startup, Python 3.8+ calls setlocale(LC_CTYPE, "") to use the
default locale, for use with C functions such as mbstowcs(). This
allows the default behavior to remain the same, unless the new option
also entails attempting locale coercion to UTF-8 via
setlocale(LC_CTYPE, ".utf-8").

The following gets the current locale's code page in C:

#include <locale.h>
// ...
_locale_t loc = _get_current_locale();
__crt_locale_data_public *locinfo =
    (__crt_locale_data_public *)loc->locinfo;
unsigned int cp = locinfo->_locale_lc_codepage;
_free_locale(loc);

The "C" locale uses code page 0. C mbstowcs() and wcstombs() handle
this case as Latin-1. locale._get_locale_encoding() could instead map
it to the process ANSI code page, GetACP(). Also, the CRT displays
CP_UTF8 (65001) as "utf8". _get_locale_encoding() should map it to
"utf-8" instead of "cp65001".
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MZC4DDCTMOX25ZQVUGBNLE6VPVXHXNKU/


[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.

2021-02-03 Thread Eryk Sun
On 2/2/21, Christopher Barker  wrote:
>
> In the common case, folks have their environment variables set in an
> initialization file (or the registry? I've lost track of what Windows does
> these days)

It hasn't fundamentally changed since the mid 1990s. Configurable
system variables are set in the registry key
"HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Environment",
and configurable user variables are set in "HKCU\Environment".
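
As a rough sketch, a persistent variable can be read from these keys
with the winreg module. The function name read_persistent_variable()
is made up for illustration, and the sketch simply returns None on
other platforms. The value is returned as stored, so a REG_EXPAND_SZ
value is not expanded.

```python
import sys


def read_persistent_variable(name, machine=False):
    """Return a persistent environment variable from the registry, or None."""
    if sys.platform != 'win32':
        return None  # the registry keys only exist on Windows
    import winreg
    if machine:
        root = winreg.HKEY_LOCAL_MACHINE
        subkey = r'SYSTEM\CurrentControlSet\Control\Session Manager\Environment'
    else:
        root = winreg.HKEY_CURRENT_USER
        subkey = 'Environment'
    try:
        with winreg.OpenKey(root, subkey) as key:
            value, _type = winreg.QueryValueEx(key, name)
    except FileNotFoundError:
        return None  # the variable is not set at this level
    return value
```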

A process is spawned with an environment that's sourced from the
parent process. Either it's inherited from the parent's environment or
it's a new environment that was passed to CreateProcessW().

The ancestor of most interactive processes in a desktop session is the
graphical shell, Explorer. At startup, it calls an undocumented
shell32 function (RegenerateUserEnvironment) to load a new environment
from scratch. It also reloads its environment in response to a
WM_SETTINGCHANGE "Environment" message.

The documented way to reload the environment from scratch is
CreateEnvironmentBlock(&env, htoken, FALSE) and
SetEnvironmentStringsW(env).
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZKIGW3CIM7GGRAVTXL2V44XZ4Q7GXG7Q/


[Python-ideas] Re: Provide UTF-8 version of Python for Windows.

2021-01-25 Thread Eryk Sun
On 1/26/21, Eryk Sun  wrote:
>
> The process active code page for GetACP() and GetOEMCP() is changed to
> UTF-8 (65001). The C runtime also overrides the user locale to UTF-8
> if GetACP() returns UTF-8, i.e. setlocale(LC_CTYPE, "") will return
> "utf8" as the encoding.

One concern is what to do for the special "ansi" and "oem" encodings.
If scripts rely on them for IPC, such as with subprocess.Popen(), then
it could be frustrating if they're just synonyms for UTF-8 (code page
65001). I've tested that it's possible for Python to peg "ansi" and
"oem" to the system ANSI and OEM code pages via GetLocaleInfoEx() with
LOCALE_NAME_SYSTEM_DEFAULT and the LCType constants
LOCALE_IDEFAULTANSICODEPAGE and LOCALE_IDEFAULTCODEPAGE (OEM). But
then they're no longer accurate within the current process, for which
ANSI and OEM are UTF-8.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5RCA3LVRBWVAHGDRGMR5RVAGP647NGDJ/


[Python-ideas] Re: Provide UTF-8 version of Python for Windows.

2021-01-25 Thread Eryk Sun
On 1/25/21, Inada Naoki  wrote:
>
> Microsoft provides UTF-8 code page for process. It can be enabled by
> manifest file.
>
> How about providing Python binaries both of "UTF-8 version" and "ANSI
> version"?

I experimented with this manifest setting several months ago. To try
it out, simply export the manifest from "python.exe", edit it to add
the "activeCodePage" setting, and then replace it in "python.exe".

The process active code page for GetACP() and GetOEMCP() is changed to
UTF-8 (65001). The C runtime also overrides the user locale to UTF-8
if GetACP() returns UTF-8, i.e. setlocale(LC_CTYPE, "") will return
"utf8" as the encoding.

The console is hosted in a separate conhost.exe or openconsole.exe
process, so it still defaults to the system OEM code page for its
input and output code pages. This pertains only to low-level os.read()
and os.write(). High-level console I/O uses io._WindowsConsoleIO for
console files, which is internally UTF-16 and outwardly UTF-8.

> * Windows team needs to maintain more versions.

I suppose the installer could install both sets of binaries, and copy
to "python[w][_d].exe" based on an installer option. But then the
UTF-8 selection statistics wouldn't be tracked, unless the installer
phones home.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/76TJ4CMMR2FXQGMKWOQCSBGVBG5DSN3K/


[Python-ideas] Re: built in to clear terminal

2020-12-22 Thread Eryk Sun
On 12/22/20, David Mertz  wrote:
> On Tue, Dec 22, 2020 at 10:26 PM Chris Angelico  wrote:
>
> I'm not sure about Windows. Is 'cls' built into the command-line executable
> itself (like Busybox) or is it an exe?

CLS is an internal command of the CMD shell. An internal command takes
precedence as long as the executed name is unquoted and has no
extension.

My concern with cls/clear is that they clear the scrollback. Is that
what most people want from a clear_screen() function?
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PIVBSNSW5T7XTLVSYKM3PBEYUG2ROFMM/


[Python-ideas] Re: built in to clear terminal

2020-12-22 Thread Eryk Sun
On 12/22/20, Barry Scott  wrote:
>
> import sys
>
> def clear_terminal():
>     if sys.platform == 'win32':
>         import ctypes
>         kernel32 = ctypes.windll.kernel32
>         # turn on the console ANSI colour handling
>         kernel32.SetConsoleMode(kernel32.GetStdHandle(-11), 7)
>
>     sys.stdout.write('\x1b[2J' '\x1b[H')

Here are some concerns I have:

* Does not support Windows 8
* Does not support legacy console in Windows 10 (on the "options" tab)
* Does not check for SetConsoleMode failure
* Does not support a different active screen buffer
* Assumes StandardOutput is a screen buffer for the current console
* Assumes the current mode of the screen buffer is 3 or 7. New modes
have been added, and even more may be added
* Sets a global console setting that persists after Python exits

Like the CRT's "conio" API, clear_screen() should open "conout$"
(temporarily), which will succeed if Python is attached to a console,
regardless of the state of the standard handles, file descriptors, or
sys.stdout, and will always open the currently active screen buffer,
regardless of how many screen buffers exist in the current console
session.

The current mode should be queried for
ENABLE_VIRTUAL_TERMINAL_PROCESSING (4) via GetConsoleMode(). If it's
not enabled, bitwise OR it into the mode and try to enable it via
SetConsoleMode(). If VT mode is enabled, write '\x1b[2J\x1b[H' to the
file.

If VT mode can't be enabled, then fall back on the legacy console API.
In particular, some people mentioned not wanting to spawn a cmd.exe
process just to use its CLS command. Even if spawning a process is
okay, the CLS command clears the scrollback, which is inconsistent
with ESC[2J. If clear_screen() is going to add ESC[3J to clear the
scrollback, then it's at least consistent, but I'd rather not clear
the scrollback.

clear_screen() should be able to emulate ESC[2J via
GetConsoleScreenBufferInfoEx (get the screen buffer size, window,
cursor position, and default character attributes),
ScrollConsoleScreenBuffer (if the screen buffer has to be scrolled up
to make space), and SetConsoleScreenBufferInfoEx (shift the visible
window in the buffer and set the cursor position). This can be
implemented in ctypes or C. But normally the standard library avoids
using ctypes.

Finally, if VT mode was temporarily enabled, revert to the original
mode, and always close the "conout$" file.
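
As a rough illustration of the VT path described above, here is a
minimal ctypes sketch. It is not a proposed implementation: the names
with_vt() and clear_screen() are hypothetical, it assumes that
os.open('CONOUT$') succeeds whenever a console is attached, and it
simply returns False where the legacy console API fallback would be
needed.

```python
import os
import sys

ENABLE_VIRTUAL_TERMINAL_PROCESSING = 0x0004
CLEAR_AND_HOME = b'\x1b[2J\x1b[H'  # erase the display, then home the cursor


def with_vt(mode):
    """Return the console mode with the VT flag OR'ed in, keeping other bits."""
    return mode | ENABLE_VIRTUAL_TERMINAL_PROCESSING


def clear_screen():
    """Clear the active screen buffer; return False if VT mode is unavailable."""
    if sys.platform != 'win32':
        # Assumption: elsewhere, stdout is a terminal that understands ANSI.
        sys.stdout.write(CLEAR_AND_HOME.decode('ascii'))
        sys.stdout.flush()
        return True

    import ctypes
    import msvcrt
    from ctypes import wintypes

    # Private WinDLL instance, so prototypes set here can't affect other code.
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    kernel32.GetConsoleMode.argtypes = (wintypes.HANDLE, wintypes.LPDWORD)
    kernel32.SetConsoleMode.argtypes = (wintypes.HANDLE, wintypes.DWORD)

    # Opens the active screen buffer; raises OSError without a console.
    fd = os.open('CONOUT$', os.O_RDWR)
    try:
        handle = msvcrt.get_osfhandle(fd)
        mode = wintypes.DWORD()
        if not kernel32.GetConsoleMode(handle, ctypes.byref(mode)):
            raise ctypes.WinError(ctypes.get_last_error())
        original = mode.value
        changed = with_vt(original) != original
        if changed and not kernel32.SetConsoleMode(handle, with_vt(original)):
            return False  # e.g. legacy console: use the console API instead
        try:
            os.write(fd, CLEAR_AND_HOME)
        finally:
            if changed:
                # Revert, so no console-wide setting outlives the call.
                kernel32.SetConsoleMode(handle, original)
    finally:
        os.close(fd)
    return True
```

Note that the flag is OR'ed into whatever mode is currently set rather
than writing a fixed value such as 7, so mode bits added in newer
Windows versions are preserved.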

Off topic comment:

> kernel32 = ctypes.windll.kernel32

I recommend the following instead:

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

The global library loaders such as ctypes.cdll and ctypes.windll are
not reliable for production code in the wild. They cache CDLL library
instances, which cache function pointers, which may have argtypes,
restype, and errcheck prototypes. Another imported package might set
function prototypes that break your code, which is an actual problem
that I've seen a few times, particularly with common routines from
kernel32, advapi32, and user32. It's not worth taking the chance of a
conflict with another package just to save a few keystrokes.

The global loaders also don't allow setting use_errno=True or
use_last_error=True, so the function pointers they create don't
capture the C errno value for ctypes.get_errno() or Windows last error
value for ctypes.get_last_error(). Calling kernel32.GetLastError()
after the fact may not be reliable in a scripting environment even if
it's called directly after the previous FFI call.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AHDEAUNZNUT6EWT7GTEGSKKFL3GABZ4W/


[Python-ideas] Re: built in to clear terminal

2020-12-20 Thread Eryk Sun
On 12/20/20, Cameron Simpson  wrote:
> On 20Dec2020 15:48, Christopher Barker  wrote:
>
>>That would be great, though I just looked at the 3.9 docs and saw:
>>"The Windows version of Python doesn’t include the curses module."
>
> Yeah, Windows.

A C or ctypes implementation is required in Windows. Virtual terminal
mode is supported in Windows 10, for which it *might* be enabled. This
can be queried via GetConsoleMode. If virtual terminal mode isn't
enabled, then clearing the screen has to be implemented by scrolling
the console screen buffer. The screen buffer size, visible rectangle,
and current character attributes can be queried via
GetConsoleScreenBufferInfo. The window can be scrolled via
ScrollConsoleScreenBuffer. The entire buffer can be scrolled out, like
the CMD shell's CLS command, or one can just scroll the buffer enough
to clear the visible window. The cursor can be set to the home
position via SetConsoleCursorPosition.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/MRU5G2IB4CBSK5TAWKHS7EXIV6ECBEKO/


[Python-ideas] Re: Global flag for whether a module is __main__

2020-11-12 Thread Eryk Sun
On 11/12/20, Chris Angelico  wrote:
>
> I actually don't use the "if name is main" idiom all that often. The
> need to have a script be both a module and an executable is less
> important than you might think. In a huge number of cases, it's
> actually better to separate out the library-like and script-like
> portions into separate files, or some other reorganization.

One caveat is that the module __name__ check is required for
multiprocessing in spawn mode (as opposed to fork mode), which is the
only supported mode in Windows.

Generally, I think scripts in installed packages are better handled
via setuptools entrypoints nowadays. For cross-platform support, pip
automatically does the right thing for Windows by creating executable
.exe launchers. The issue is two-fold: the default action of the .py
file association is often configured to edit rather than execute
scripts, and, irrespective of the latter, many execution paths, such
as subprocess.Popen, use CreateProcess from the base API, which does
not support file associations. (File associations are the closest
Windows has to Unix shebangs, and the basis for how the py.exe
launcher supports Unix shebangs in scripts, but they're implemented in
the high-level shell API instead of the base API.)
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/ENI3R2KWR7Q2V4POOWJLLHMBRKLWMSNZ/


[Python-ideas] Re: New feature

2020-10-18 Thread Eryk Sun
On 10/18/20, Mike Miller  wrote:
> On 2020-10-17 17:05, Eryk Sun wrote:
>> CMD's CLS is implemented with three API calls:
>> GetConsoleScreenBufferInfo to get the screen-buffer dimensions and
>> default text attributes, ScrollConsoleScreenBufferW to shift the
>> buffer out, and SetConsoleCursorPosition to move the cursor to (0,0).
>
> Would you happen to have a link to some Python/ctypes code to implement
> this?

I would expect this to be implemented in posixmodule.c, not via
ctypes. I can help with the implementation in C. Read the following
pages in the console docs, if you haven't already:

https://docs.microsoft.com/en-us/windows/console/scrolling-the-screen-buffer
https://docs.microsoft.com/en-us/windows/console/scrolling-a-screen-buffer-s-contents
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/ZN7UFUZJASVZRXP5NVMM3BXTTYRGDMXW/


[Python-ideas] Re: New feature

2020-10-18 Thread Eryk Sun
On 10/18/20, Mike Miller  wrote:
>
> Also, a shell is not a terminal, so terminal routines don't feel right in
> shutil.  Putting get_terminal_size() there was a mistake imho.

The shutil module "offers a number of high-level operations on files".
ISTM that shutil.get_terminal_size is a high-level operation on
sys.__stdout__, if it's a terminal/console device file, though it's an
odd duck since the rest of the module is dealing with filesystem
files. That said, rightly or wrongly, I think of shutil as a
collection of shell utility (SHell UTILity) functions for Python's
standard library, so I'm comfortable with expanding its mandate to
functions commonly supported by CLI shell environments, such as
terminal/console management.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/237RUHNA6IFRXPBBWHH3QHIEUNO77MRG/


[Python-ideas] Re: New feature

2020-10-18 Thread Eryk Sun
On 10/17/20, Christopher Barker  wrote:
>
> then how about os.clear_terminal() ?

IMO, an os level function such as os.clear_terminal(fd) should only
support terminal/console devices and would be implemented in
Modules/posixmodule.c. Higher-level behavior and support for IDEs
belongs in shutil.

> Sure, there's a manageable set of default terminals across the major OSs
> (and Linux desktops), but there are a LOT of others as well, including
> IDEs, and even the new Terminal in Windows:

I would expect os.clear_terminal() to make exceptions only for popular
terminals/consoles, if they don't support the common ANSI sequence to
clear the screen.

In Windows 10, you can enable virtual terminal (VT) mode by default
for all non-legacy console sessions by setting "VirtualTerminalLevel"
to 1 in "HKCU\Console". VT mode supports the standard ANSI sequences
for clearing the terminal and/or scrollback. Regardless of the
VirtualTerminalLevel setting, each tab in Windows Terminal is a
headless pseudoconsole session (ConPTY) that has VT mode enabled by
default.

In all supported versions of Windows, if VT mode is disabled or not
supported, as determined by GetConsoleMode, then the console screen
buffer can be scrolled or cleared via GetConsoleScreenBufferInfo and
ScrollConsoleScreenBuffer, and the cursor can be reset to (0,0) via
SetConsoleCursorPosition.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/7TNXGFFPEUUGX7GTL4S4JRV4Q42EJTLL/


[Python-ideas] Re: New feature

2020-10-17 Thread Eryk Sun
On 10/16/20, Rob Cliffe via Python-ideas  wrote:
>
> May I suggest it be called os.clearscreen()?

I'd prefer shutil.clear_screen(). There's already
shutil.get_terminal_size(). I know there's also
os.get_terminal_size(), but its use isn't encouraged.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/RSYOV4KUQOFKUUQETFRXPWJC64RV47RN/


[Python-ideas] Re: New feature

2020-10-17 Thread Eryk Sun
On 10/13/20, Mike Miller  wrote:
>
> The legacy Windows console has another limitation in that I don't believe it
> has a single API call to clear the whole thing.  One must iterate over the
> whole buffer and write spaces to each cell, or some similar craziness.

No, it's not really similar craziness -- at least not from the client
program's perspective. The implementation in the console host itself
is probably something like that.

CMD's CLS is implemented with three API calls:
GetConsoleScreenBufferInfo to get the screen-buffer dimensions and
default text attributes, ScrollConsoleScreenBufferW to shift the
buffer out, and SetConsoleCursorPosition to move the cursor to (0,0).

https://docs.microsoft.com/en-us/windows/console/scrollconsolescreenbuffer

The following debugger session was captured while stepped into CMD's
eCls() function, which implements the CLS command. It is paused just
before the call to ScrollConsoleScreenBufferW, with parameters 1-4 in
registers rcx, rdx, r8, and r9, and parameter 5 on the stack.

lpScrollRectangle (rdx): the entire screen buffer (sized 125 x 9001)
is to be scrolled.

0:000> ?? ((SMALL_RECT *)@rdx)
struct _SMALL_RECT * 0x00e9`be8ff8b8
   +0x000 Left : 0n0
   +0x002 Top  : 0n0
   +0x004 Right: 0n125
   +0x006 Bottom   : 0n9001

dwDestinationOrigin (r9): the target row is -9001, so the contents of
the entire buffer are shifted out.

0:000> ?? (short)(@r9 >> 16)
short 0n-9001

lpFill (rsp / stack): use a space with the default attributes (in my
case background color 0 and foreground color 7, in the current
16-color palette).

0:000> ?? ((CHAR_INFO **)@rsp)[4]->Char.UnicodeChar
wchar_t 0x20 ' '
0:000> ?? ((CHAR_INFO **)@rsp)[4]->Attributes
unsigned short 7
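
For comparison, the same three calls can be sketched with ctypes. This
is only an untested illustration that mirrors the arguments seen in the
debugger session; cls_scroll_args() and clear_legacy_console() are
hypothetical names. It assumes that standard output is a screen buffer
of the current console, and it fills with the buffer's current
attributes rather than CMD's default attributes.

```python
def cls_scroll_args(width, height):
    """Return (scroll_rect, dest_origin) that shift a whole buffer out.

    scroll_rect is (left, top, right, bottom) and dest_origin is (x, y),
    matching the values seen in the debugger session above.
    """
    return (0, 0, width, height), (0, -height)


def clear_legacy_console():
    """Windows only: clear the screen buffer without VT sequences."""
    import ctypes
    from ctypes import wintypes

    class CHAR_INFO(ctypes.Structure):
        # The Char union is declared as its UnicodeChar member only.
        _fields_ = [('UnicodeChar', wintypes.WCHAR),
                    ('Attributes', wintypes.WORD)]

    class CONSOLE_SCREEN_BUFFER_INFO(ctypes.Structure):
        _fields_ = [('dwSize', wintypes._COORD),
                    ('dwCursorPosition', wintypes._COORD),
                    ('wAttributes', wintypes.WORD),
                    ('srWindow', wintypes.SMALL_RECT),
                    ('dwMaximumWindowSize', wintypes._COORD)]

    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.GetConsoleScreenBufferInfo.argtypes = (
        wintypes.HANDLE, ctypes.POINTER(CONSOLE_SCREEN_BUFFER_INFO))
    kernel32.ScrollConsoleScreenBufferW.argtypes = (
        wintypes.HANDLE, ctypes.POINTER(wintypes.SMALL_RECT),
        ctypes.POINTER(wintypes.SMALL_RECT), wintypes._COORD,
        ctypes.POINTER(CHAR_INFO))
    kernel32.SetConsoleCursorPosition.argtypes = (
        wintypes.HANDLE, wintypes._COORD)

    handle = kernel32.GetStdHandle(-11)  # STD_OUTPUT_HANDLE
    info = CONSOLE_SCREEN_BUFFER_INFO()
    if not kernel32.GetConsoleScreenBufferInfo(handle, ctypes.byref(info)):
        raise ctypes.WinError(ctypes.get_last_error())

    rect, origin = cls_scroll_args(info.dwSize.X, info.dwSize.Y)
    fill = CHAR_INFO(' ', info.wAttributes)  # space, current attributes
    if not kernel32.ScrollConsoleScreenBufferW(
            handle, ctypes.byref(wintypes.SMALL_RECT(*rect)), None,
            wintypes._COORD(*origin), ctypes.byref(fill)):
        raise ctypes.WinError(ctypes.get_last_error())
    if not kernel32.SetConsoleCursorPosition(handle, wintypes._COORD(0, 0)):
        raise ctypes.WinError(ctypes.get_last_error())
```

From the client's side this is one scroll call for the whole buffer,
with no per-cell loop.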
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/SYMJ4HM6ZUBA3HMS5QXIDVSMQDRECHFP/


[Python-ideas] Re: New feature

2020-10-17 Thread Eryk Sun
On 10/16/20, Steven D'Aprano  wrote:
>
> On terminals that support it, this should work:
>
> - `print('\33[H\33[2J')`
>
> but I have no idea how to avoid clearing the scrollback buffer on
> Windows, or other posix systems with unusual terminals.

In Windows 10, ANSI sequences and some C1 control characters (e.g.
clear via CSI -- '\x9b2J\x9bH') are supported by a console session if
it's not in legacy mode. The ESC character can be typed as Ctrl+[,
which is useful in the CMD shell, which doesn't support character
escapes such as \33 or \x1b. It can also be set in an %ESC%
environment variable.

Using ANSI sequences and C1 controls requires virtual terminal (VT)
mode to be enabled for the console screen buffer. VT mode is enabled
by default in a pseudoconsole session (e.g. when attached to a tab in
Windows Terminal), but it can be manually disabled. It's also enabled
by default for non-legacy console sessions if "VirtualTerminalLevel"
is set to 1 in "HKCU\Console". Regardless, it's simple to check
whether VT mode is currently enabled for the screen buffer via WinAPI
GetConsoleMode.

If VT mode isn't enabled, the screen buffer can be scrolled using the
console API function ScrollConsoleScreenBuffer using dimensions and
attributes from GetConsoleScreenBufferInfo, and the cursor position
can be set via SetConsoleCursorPosition.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/2JOPCG55LD6I7S6673C3BNTH2EDSLSWH/


[Python-ideas] Re: New feature

2020-10-16 Thread Eryk Sun
On 10/16/20, Barry Scott  wrote:
>
> I find that you have to do this to turn on ANSI processing in CMD.EXE on
> Windows 10 and I assume earlier Windows as well:

You mean the console-session host (conhost.exe). This has nothing to
do with the CMD shell. People often confuse CLI shells (CMD,
PowerShell, bash) with the console/terminal that they use for standard
I/O.

Virtual Terminal mode is supported by the new console in Windows 10 --
not in earlier versions of Windows and not with the legacy console in
Windows 10. If you need to support ANSI sequences with the legacy
console host, consider using a third-party library such as colorama.

You can enable VT mode by default for regular console sessions (i.e.
not headless sessions such as under Windows Terminal, for which it's
always enabled) by setting a DWORD value of 1 named
"VirtualTerminalLevel" in the registry key "HKCU\Console".

> import ctypes
> kernel32 = ctypes.windll.kernel32
> # turn on the console ANSI colour handling
> kernel32.SetConsoleMode( kernel32.GetStdHandle( -11 ), 7 )

You should enable the flag in the current mode value and implement
error handling:

import ctypes
kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

STD_OUTPUT_HANDLE = -11
ENABLE_VIRTUAL_TERMINAL_PROCESSING = 4
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value

kernel32.GetStdHandle.restype = ctypes.c_void_p

hstdout = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
if hstdout == INVALID_HANDLE_VALUE:
    raise ctypes.WinError(ctypes.get_last_error())

mode = ctypes.c_ulong()
if not kernel32.GetConsoleMode(hstdout, ctypes.byref(mode)):
    raise ctypes.WinError(ctypes.get_last_error())

mode.value |= ENABLE_VIRTUAL_TERMINAL_PROCESSING
if not kernel32.SetConsoleMode(hstdout, mode):
    raise ctypes.WinError(ctypes.get_last_error())
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/UHIZNALCL7UGX5LXJACHHKOHMUMXACKN/


[Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous

2020-06-01 Thread Eryk Sun
On 5/25/20, Christopher Barker  wrote:
> On Mon, May 25, 2020 at 10:59 AM Steve Barnes 
> wrote:
>
>> On Windows
>> https://freetechtutors.com/create-virtual-hard-disk-using-diskpart-windows/
>> gives a nice description of creating a virtual disk with only operating
>> system commands. Note that it shows the commands being used interactively
>> but it can also be scripted by starting diskpart /s script.txt -
>
> ...
>
>> it does need to be run as admin.
>>
>
> Well, darn. That seriously reduces its usefulness.

Creating and mounting a VHD can be implemented in PowerShell as well,
but it requires installing the Hyper-V management tools and services
(not the hypervisor itself). The hyper-v cmdlets for managing virtual
disks (e.g. mount-vhd) also require administrator access, since the
underlying API does (e.g. AttachVirtualDisk). A new system service and
client application would have to be developed in order to allow
standard users to manage virtual disks.

Here's a PowerShell example to create, format, and mount a volume on a
10 MiB virtual disk. This example mounts the volume both at "V:/" and
at "C:/Mount/vhd_mount":

$vhdpath = 'C:\Mount\temp.vhdx'
$mountpath = 'C:\Mount\vhd_mount'

# create and mount the physical disk as a RAW volume
new-vhd -path $vhdpath -fixed -sizebytes (10 -shl 20)
mount-vhd -path $vhdpath

# create a partition on the disk, format it as NTFS, and assign
# the DOS device name "V:" and label "vhd"
$nd = (get-vhd -path $vhdpath).DiskNumber
new-volume -disknum $nd -filesys ntfs -drive V -friendly vhd

# set a folder mountpoint
mkdir $mountpath
add-partitionaccesspath -drive V -accesspath $mountpath

If no drive letter is desired, use the -accesspath option of the
new-volume cmdlet instead of the -driveletter option.

The following command dismounts the disk:

dismount-vhd -path $vhdpath

The mountpoint on the empty directory remains set but inaccessible
once the disk is dismounted. You can delete this directory if the
disk won't be mounted again.  Or, while the disk is mounted, you can
remove the mountpoint via remove-partitionaccesspath.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/3EEZV32XMNGLGB6Q267MGYHPBSTO55FH/


[Python-ideas] Re: Sanitize filename (path part) 2nd try

2020-05-13 Thread Eryk Sun
On 5/13/20, Antoine Pitrou  wrote:
>
> If you know of a system function which accepts filenames with embedded
> NULs (which probably means it also takes the filename length as a
> separate parameter), I'd be curious to know about it.

Windows is layered over the base NT system, which uses counted strings
and a root object namespace that reserves only the path separator,
backslash. Null characters are allowed, at least as far as the object
manager cares, but using them is a bad idea, if only because such
names aren't generally accessible in Windows. But let's look at an
example just for kicks.

When the object manager parses a path up to a Device object (e.g.
"\Device\NamedPipe"), the I/O manager takes over parsing the remaining
path, which calls the device driver's IRP_MJ_CREATE routine with the
remaining path. Whether or not a name with nulls is allowed depends on
the device driver -- or a filesystem driver if the device is mounted.

Almost all filesystem drivers reject a component name that contains
nulls as invalid. One exception is the named-pipe filesystem (NPFS).
NPFS doesn't disallow any characters. It even allows backslash in pipe
names since it doesn't support subdirectories, and if you check via
os.listdir('//./pipe'), you should see several Winsock pipes with
backslash in their name.

Creating a pipe with nulls in its name is impossible via WINAPI
CreateNamedPipeW. It requires native NtCreateNamedPipeFile, with the
name passed in an OBJECT_ATTRIBUTES record [1]. This system function
is undocumented, but just to show that it's possible in principle, I
created a pipe named "spam\x00eggs". We can query the name via
GetFileInformationByHandleEx: FileNameInfo [2], which returns a
counted string:

>>> GetFileInformationByHandleEx(h, FileNameInfo)
'\\spam\x00eggs'

The name is in the root path of the device, but we don't get the
fully-qualified name "\\Device\\NamedPipe\\spam\x00eggs". WINAPI
GetFinalPathNameByHandleW [3] can figure this out, at least for the
native NT path (from NtQueryObject). However, it works with
null-terminated strings, so the pipe name gets truncated as "spam":

>>> flags = VOLUME_NAME_NT | FILE_NAME_OPENED
>>> GetFinalPathNameByHandle(h, flags)
'\\Device\\NamedPipe\\spam'

[1]: https://docs.microsoft.com/en-us/windows/win32/api/ntdef/ns-ntdef-_object_attributes
[2]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_name_info
[3]: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfinalpathnamebyhandlew
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/6EB4UMWV3TRMWL6RPY2KFU7PYJTYF4SY/


[Python-ideas] Re: Sanitize filename (path part) 2nd try

2020-05-12 Thread Eryk Sun
On 5/11/20, Oleg Broytman  wrote:
> On Mon, May 11, 2020 at 09:12:52PM -, Steve Jorgensen
>  wrote:
>
>> When the platform is Windows, certainly, ":" should not be
>> allowed, and perhaps colon should not be allowed at all.

The meaning of "X:name", where "X" is a single letter, is context
dependent. If it occurs at the beginning of a path, it's relative to
the working directory on drive "X:", which defaults to the root
directory on the drive.
For example, if the working directory on drive "X:" is "X:\spam\eggs",
then "X:foo" resolves to "X:\spam\eggs\foo". "X:foo" in this context
is not a valid component name; it's actually a filepath.
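
The drive-relative reading can be checked from any platform with the
ntpath module, which implements Windows path parsing in pure Python. A
small sketch, using the same hypothetical "X:" drive as the example
above:

```python
import ntpath  # Windows path rules; importable on any platform

# "X:foo" has a drive but no root directory, so it is not absolute.
drive, rest = ntpath.splitdrive('X:foo')
is_abs = ntpath.isabs('X:foo')

# Joined onto a directory on the same drive, it resolves like a
# relative path instead of replacing the directory.
resolved = ntpath.join('X:\\spam\\eggs', 'X:foo')

print(drive, rest, is_abs, resolved)
```

This only mimics the resolution lexically. The real working directory
of each drive is tracked per process by Windows.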

Otherwise the colon is part of an NTFS or ReFS stream path, where
":" is the stream delimiter. To be valid, it needs to be followed by
either the name of the stream or the name plus the type, e.g.
"filename:streamname" or "filename:streamname:streamtype".

Should file streams be supported?

More on File Streams

An open or create will fail as an invalid filename if it uses invalid
stream syntax or references a stream type that's unknown, or if the
filesystem doesn't support streams and disallows colon in filenames
(e.g. FAT32).

The stream name can be empty to indicate an anonymous or default
stream, but only if the stream type is specified. For example, in NTFS
"filename::$DATA" is the anonymous data stream in a file named
"filename". For a regular data file, it's the same as just accessing
"filename".

A directory can have named data streams, but it cannot have an
anonymous data stream. The default stream in a directory is an index
stream named "$I30". The following are equivalent names for a
directory in NTFS: "dirname", "dirname::$INDEX_ALLOCATION", and
"dirname:$I30:$INDEX_ALLOCATION". But "dirname:$I30" doesn't work
because the default stream type is $DATA.

To access a stream in a single-letter filename relative to the current
directory, the current directory has to be referenced explicitly via
the "." component. For example, "./C:spam" is a stream named "spam" in
a file named "C" that's in the current working directory, but "C:spam"
is a file named "spam" in the working directory on drive "C:".

>Forbidden characters:
>
> chr(0) < > : " / \ | ? *
>
> characters in range from chr(1) through chr(31),

See the above discussion regarding ":". An NTFS stream name can
include any character except for nul (0), colon, backslash, and slash.

The characters *?"<> are the 5 wildcard characters that almost all NT
filesystems disallow in filenames. These are important to disallow
because the filesystem driver (in the kernel) is expected to support
filtering a directory listing with a wildcard pattern. NT's * and ?
wildcards have Unix shell semantics. The other three are DOS_DOT ("),
DOS_STAR (<), and DOS_QM (>), which help to emulate MS-DOS behavior.

The vertical bar or pipe (|) has no significance in filepaths, but
it's a special shell character that's usually disallowed in filenames.
Control characters 1-31 usually are also disallowed. That said, some
non-Microsoft filesystems may allow these characters. For example, the
VirtualBox shared-folder filesystem allows pipe and control characters
in filenames.

> a space or a period at the end of file/directory name.

Trailing spaces and dots are stripped from the final path component in
almost all contexts. The exception is "\\?\" device paths, which are
never normalized in an open or create context. For example, creating
"\\?\C:\Temp\spam. . . " will name the file "spam. . . " instead of the
normal name "spam". The name "spam. . . " will appear in the directory
listing, but opening it will require using a "\\?\" device path.

> Forbidden file names (with any extensions):
>
> CON, PRN, AUX, NUL,
> COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9,
> LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.

In an attempt to replicate how MS-DOS implemented devices, Windows
reserves DOS device names such as "NUL" in the final component of DOS
drive-letter paths and relative paths. They are not reserved in the
final component of UNC and device paths, though a server may disallow
them by policy, as Microsoft's SMB server does.

Matching the device name ignores everything after a trailing colon or
dot that follows the name with 0 or more intervening spaces. This is
more than ignoring an extension, which is typically taken as the
characters following the last dot in a filename.

"CONIN$" and "CONOUT$" are mistakenly excluded from the documented
list of reserved DOS device names. Windows has always reserved them as
unqualified relative names in a create/open context. Starting with
Windows 8, they're reserved exactly the same as the classic DOS device
names.
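
To make the matching rules concrete, here is a conservative sketch of a
check for a single path component. It is only an illustration of the
rules as described in this message, not an authoritative list: the
helper names are hypothetical, and it rejects the colon outright rather
than trying to validate stream syntax.

```python
import re

# Device names listed in the thread, plus CONIN$ and CONOUT$.
DOS_DEVICES = {'CON', 'PRN', 'AUX', 'NUL', 'CONIN$', 'CONOUT$'}
DOS_DEVICES |= {'%s%d' % (prefix, n)
                for prefix in ('COM', 'LPT') for n in range(1, 10)}

# Wildcards, separators, colon, pipe, and control characters 0-31.
FORBIDDEN_CHARS = set('<>:"/\\|?*') | {chr(n) for n in range(32)}


def is_reserved_name(name):
    """True if name matches a DOS device, ignoring a dot/colon suffix."""
    # Drop everything from the first dot or colon, then the spaces
    # between the device name and that dot or colon.
    base = re.split(r'[.:]', name, maxsplit=1)[0].rstrip(' ')
    return base.upper() in DOS_DEVICES


def is_portable_component(name):
    """Conservative check for one path component (hypothetical helper)."""
    if not name or name[-1] in ' .':
        return False  # trailing spaces and dots get stripped
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    return not is_reserved_name(name)
```

Because the match stops at the first dot or colon, a name such as
"nul.tar.gz" is treated as reserved, which is stricter than only
ignoring the final extension.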

Examples with trailing dots and spaces:

>>> os.getcwd()
'C:\\'
>>> nt._getfullpathname('spam. . . ')
'C:\\spam'
>>> nt._getfullpathname('foo/spam. . . ')
'C:\\foo\\spam'

DOS devices:

>>> nt._getfullpathname('conin$:spam.eggs')
'.\\conin$'
>>> nt._getfullpathname('foo/conin$  

[Python-ideas] Re: Improve the Windows python installer to reduce new user confusion

2020-04-11 Thread Eryk Sun
On 4/11/20, Barry Scott  wrote:
>> On 10 Apr 2020, at 20:14, Christopher Barker  wrote:
>>
>> Also, if order to get python top level scripts to work, there needs to be
>> a PATH entry for that, too.
>
> Do you mean the #! lines? That is taken care of by py.exe and how it was
> installed.

I think by "top level" Christopher means running "foo.py" directly, or
just "foo" if ".PY" is in PATHEXT.  The installer's option to update
environment variables adds the "Scripts" directory to PATH and adds
the .PY and .PYW file extensions to PATHEXT. It would be more flexible
to split this out as an independent option. (Note that the "Scripts"
directory also contains scripts that are embedded in a launcher
executable, such as pip.exe, which distlib uses for entry-point
scripts. But many entry-point scripts are commonly run via py.exe
instead, such as `py -m pip`.)
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/22OFVG52S522CDTRZXLTL4DEX26522RS/


[Python-ideas] Re: Improve the Windows python installer to reduce new user confusion

2020-04-10 Thread Eryk Sun
On 4/10/20, MRAB  wrote:
> On 2020-04-10 20:14, Christopher Barker wrote:
>
>> How does py.exe get on the PATH?
>>
> py.exe goes into the Windows folder, which is on the PATH.

That's the typical setup, but a standard user who can't get
over-the-shoulder (OTS) administrator access has to install the
launcher just for the current user, in
"%LocalAppData%\Programs\Python\Launcher", which the installer
automatically adds to the per-user PATH.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/5RIUDB7L4ON3SFIPEYBLOZCXQN3WUIYB/


[Python-ideas] Re: About python3 on windows

2020-03-25 Thread Eryk Sun
On 3/25/20, Barry Scott  wrote:
>> On 25 Mar 2020, at 09:15, Eryk Sun  wrote:
>>
>> That is not consistent with Unix. env is supposed to search PATH for
>> the command. However, the launcher does not search PATH for a
>> versioned command such as "python3". Instead it uses the highest
>> version that's registered for 3.x or 2.x, respectively, or the version
>> set by PY_PYTHON3 or PY_PYTHON2 if defined, respectively.
>
> I think the reasoning is that the whole point of the py.exe is to avoid
> having users edit their PATH on Windows. And further the thinking
> goes that you do not need the alternatively named python programs.

The py launcher's "env" command searches PATH for anything from
"python" to "notepad" -- but not for a versioned Python command such
as "python3" or "python2".  It always uses a registered installation
in this case, which is at the very least problematic when using
"#!/usr/bin/env python3" in an active virtual environment. Paul Moore
will probably suggest that the script should use "#!/usr/bin/env
python" instead, but that will run 2.x in most Unix systems unless a
3.x environment is active. We can assume that such a script requires
3.x and is meant to run flexibly, in or out of an active environment.

I'd prefer a consistent implementation of the "env" command that
doesn't special case versioned "pythonX[.Y]" commands compared to
plain "python". But another option that will at least make
virtual-environment users happy would be for "env" to check for an
active VIRTUAL_ENV and read its Python version from "pyvenv.cfg".
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/NEO6ZBIIGL2JWVG77SHUKNTWLY2ZFJ5G/


[Python-ideas] Re: About python3 on windows

2020-03-25 Thread Eryk Sun
On 3/25/20, Steve Barnes  wrote:
>> Except it's not necessarily what the original post wants. The OP wants the
>> shebang "#!/usr/bin/env python3" to "work everywhere by
>> default", for which I assume it's implied that it should work consistently
>> everywhere. I'd prefer for the launcher's env search to also
>> support versioned "pythonX[.Y][-32|-64]" commands such as "python3".
>
> The windows launcher already does support this with shebangs of:
> #!/usr/bin/env python3 # Launch with the latest/preferred version of python3
> #!/usr/bin/env python2 # Launch with the latest/preferred version of python2

That is not consistent with Unix. env is supposed to search PATH for
the command. However, the launcher does not search PATH for a
versioned command such as "python3". Instead it uses the highest
version that's registered for 3.x or 2.x, respectively, or the version
set by PY_PYTHON3 or PY_PYTHON2 if defined, respectively.

> #!/usr/bin/env python # Launch with the latest/preferred version of python 2
> unless PY_PYTHON=3[.n[-64/32]] is set or py.ini has the same in.

In this case "env" first searches PATH before falling back on
registered installations and PY_PYTHON, which is correct -- at least
for the PATH search. I would prefer that "env" never checks registered
installations. For the registry fallback, it should instead check the
user and system "App Paths" key, like what ShellExecuteExW does.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/YX4DOI4MWTB7AVL4QMT5EN4TQBNRSHEZ/


[Python-ideas] Re: About python3 on windows

2020-03-25 Thread Eryk Sun
On 3/25/20, Steve Barnes  wrote:
> Of course if, rather than creating symlinks, you create a batch file called
> python3.bat and containing the line:
> @py -3 %*

Batch scripts execute via cmd.exe, with an attached console, and when
Ctrl+C is typed they display a "Terminate batch job (Y/N)?" prompt
when cmd.exe resumes. This makes them a poor substitute for a link.

If the link is created beside "py.exe", it's better to use a relative
symlink. If the link is in another directory, use a shell link (i.e.
shortcut). Set the command to run as "C:\Windows\py.exe -3", or
wherever "py.exe" is installed. The shell API will pass command line
arguments. Clear the shortcut's "start in" field in order to inherit
the parent's working directory. Add ".LNK" to PATHEXT to be able to
run "python3" on the command line instead of requiring "python3.lnk".

> It is also worth mentioning that python (and the py launcher) both accept
> windows paths (\ separated) and *nix paths (/ separated) from the command
> line and that from within scripts the *nix path separator is to be preferred

That's generally true. Note however that the only reliable way to
access a path that exceeds MAX_PATH characters (260, or less depending
on context) is with a \\?\ extended path, which must use backslash.
(Python 3.6+ does support long normal paths in Windows 10, but this
capability has to be enabled at the system level. Plus many scripts
and applications still need to support Windows 7 and 8.)
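As a rough sketch of the conversion, here is a helper that builds the extended form of an absolute path using only ntpath string operations. The name to_extended_path is hypothetical, and it deliberately leaves existing \\?\ and \\.\ device paths untouched:

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"
DEVICE_PREFIX = "\\\\.\\"


def to_extended_path(path):
    """Return the extended-path form of an absolute Windows path."""
    if path.startswith((EXTENDED_PREFIX, DEVICE_PREFIX)):
        return path  # already a device path; leave it alone
    # Extended paths skip the usual normalization, so do it up front:
    # convert forward slashes to backslashes and collapse "." and "..".
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_PREFIX + "UNC\\" + path[2:]
    if not ntpath.isabs(path):
        raise ValueError("extended paths must be absolute: %r" % path)
    return EXTENDED_PREFIX + path
```

Because it only manipulates strings, this can be tried on any platform. The resulting path is only meaningful when passed to the Windows file API.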

An extended path is also required to open files with certain reserved
names such as DOS devices (e.g. "con" or "nul:.txt") and names that
end with spaces and dots (e.g. "spam. . ."). But please do not use an
extended path in order to assign reserved names. It just causes
needless problems.

> glob.glob("C:/Users/Gadget/Documents/*.docx") - the only real issue to avoid
> is the fact that Windows paths are case insensitive so names that differ
> only in case changes can & will collide.

A FAT32 filesystem is case insensitive in Unix (e.g. on a portable
drive), so this problem isn't limited to Windows. It's just more
common in Windows.

Also, an NTFS directory tree can be flagged as case sensitive in
Windows 10, but thankfully this isn't commonly used, even by
developers.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/HHOB2HGK7AN7ZATWAS2GPBMN3CDOLGKY/


[Python-ideas] Re: About python3 on windows

2020-03-25 Thread Eryk Sun
On 3/24/20, Mike Miller  wrote:
> On 2020-03-24 11:58, Eryk Sun wrote:
>
>> You can manually copy or symlink python.exe to python3.exe in the
>> installation directory and venv "Scripts" directories. However, it
>> will only be used on the command line, and other contexts that search
>> PATH. Currently the launcher will not use it with a virtual "env"
>> shebang. The launcher will search PATH for "python", but not
>> "python3".
>
> Thanks.  Sure, there are many ways to fix this manually, or work around it.

Except it's not necessarily what the original post wants. The OP wants
the shebang "#!/usr/bin/env python3" to "work everywhere by default",
for which I assume it's implied that it should work consistently
everywhere. I'd prefer for the launcher's env search to also support
versioned "pythonX[.Y][-32|-64]" commands such as "python3".

I'd also prefer the env search to check the user and system "App
Paths" key [1] if the name isn't found in PATH. Each subkey of "App
Paths" is the name of a command such as "python3.exe", for which the
fully-qualified executable filename is the key's default value. This
is the Windows shell API equivalent of creating symlinks in
"~/.local/bin" and "/usr/bin" on Unix systems. ShellExecuteExW checks
"App Paths", but the launcher has to use CreateProcessW, which is
beneath the shell API.

> Would be great if it was consolidated, with one command "to rule them all."

I'm in favor of "py" becoming the cross-platform command to run Python
from the command line, since there's already a lot of inertia in that
direction on Windows. Brett Cannon is working on a Unix version [2].

[1]: https://docs.microsoft.com/en-us/windows/win32/shell/app-registration#using-the-app-paths-subkey
[2]: https://crates.io/crates/python-launcher
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/VHHZ263OEMNISHMBTPPTC7OVYJNA4KIO/


[Python-ideas] Re: About python3 on windows

2020-03-24 Thread Eryk Sun
On 3/24/20, Mike Miller  wrote:
>
>  C:\Users\User>python3
>  (App store loads!!)

If installed, the app distribution has an appexec link for
"python3.exe" that actually works.

>  C:\Python38>dir
>   Volume in drive C has no label.
> [snip]
> Note there is no python3.exe binary.

You can manually copy or symlink python.exe to python3.exe in the
installation directory and venv "Scripts" directories. However, it
will only be used on the command line, and other contexts that search
PATH. Currently the launcher will not use it with a virtual "env"
shebang. The launcher will search PATH for "python", but not
"python3".
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/6U64JXGJZ4TFTCXJ6X636AYI5QYQLVMX/


[Python-ideas] Re: About python3 on windows

2020-03-24 Thread Eryk Sun
On 3/24/20, Barry Scott  wrote:
>
> If you have python 2 and 3 installed then
>
>py -3 myscript

"myscript" may have a shebang that runs the "python2" virtual command
(e.g. "#!python2" or "#!/usr/bin/python2") because the script requires
2.x, but using "-3" will override it to run the "python3" virtual
command instead.

The "python2" virtual command defaults to the highest installed
version of 2.x. The "python3" virtual command defaults to the highest
installed version of 3.x. Without a shebang, the "python" virtual
command defaults to the highest installed 3.x, but with a shebang it
defaults to the highest installed 2.x. py without a script (e.g. the
REPL or -c or -m) uses the "python" virtual command, so it defaults to
the highest installed 3.x.

The version to run for the "python" virtual command is set via the
PY_PYTHON environment variable, whether or not there's a shebang.
Similarly the version to run for the "python2" and "python3" virtual
commands is set via PY_PYTHON2 and PY_PYTHON3. No sanity checks are
performed, so "PY_PYTHON2=3" is allowed.
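The rules in the last two paragraphs can be modeled with a few lines of Python. This is only a simplified sketch of the behavior described here, not the launcher's actual code: it ignores py.ini and the -32/-64 suffixes, and the function name resolve_virtual_command is made up for illustration.

```python
ENV_NAMES = {
    "python": "PY_PYTHON",
    "python2": "PY_PYTHON2",
    "python3": "PY_PYTHON3",
}


def resolve_virtual_command(command, installed, environ, from_shebang):
    """Return the version spec that a virtual command would run.

    installed is a list of (major, minor) tuples for registered installs.
    """
    override = environ.get(ENV_NAMES[command])
    if override:
        # No sanity checks: PY_PYTHON2=3 is accepted as-is.
        return override
    if command == "python2":
        major = 2
    elif command == "python3":
        major = 3
    else:
        # Plain "python": 2.x when it comes from a shebang, else 3.x.
        major = 2 if from_shebang else 3
    candidates = [v for v in installed if v[0] == major]
    best = max(candidates, default=None)
    return None if best is None else "%d.%d" % best
```

For example, with 2.7, 3.7 and 3.8 installed and no environment variables set, a "#!python" shebang resolves to 2.7, while running py without a script resolves to 3.8.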

>> #! /usr/bin/env python3
>
> This does work out of the box because py.exe is run when you execute a .py
> in the CMD.

The "/usr/bin/env" virtual command is expected to search PATH. py does
search PATH for "/usr/bin/env python", but for "/usr/bin/env python3"
it uses the "python3" virtual command instead of searching, since
standard Python installations and virtual environments do not include
"python3.exe". There's an open issue for this, but there's no
consensus.

> You can check by doing:
>
> assoc .py
> ftype Python.File
>
> If Python.File is not using py.exe then you can fix that with this command
> from an Admin CMD.
>
> ftype Python.File="C:\windows\py.exe" "%1" %*

CMD's internal assoc and ftype commands are no longer useful in
general. They date back to Windows NT 4 (1996) and have never been
updated. As far as I know, Microsoft has no up-to-date, high level
commands or PowerShell cmdlets to replace assoc and ftype. In
PowerShell I suppose you could P/Invoke the shell API (e.g.
AssocQueryStringW).

assoc and ftype only access the basic system file types and progids in
"HKLM\Software\Classes", not the HKCR view that merges in and prefers
"HKCU\Software\Classes" or various other subkeys such as
"Applications" and "SystemFileAssociations". Also, they don't account
for the cached and locked user choice in the shell.

For example, assoc may tell you that ".py" is associated with the
"Python.File" progid. But it's potentially wrong. It's not aware of an
association set in "HKCU\Software\Classes\.py" (e.g. set by a per-user
Python installation). It's also not aware of a locked-in user choice
(i.e. the user selected to always use a particular app), if one is
set, in 
"HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\FileExts\.py\UserChoice".

The user should set the file association using the settings dialog for
choosing default apps by file type or the "open with" dialog on the
right-click context menu. If the application chooser doesn't include a
Python icon with a rocket on it, then probably the launcher and
Python.File progid are installed for all users, but there's a per-user
association in "HKCU\Software\Classes\.py" that's overriding the
system setting. Deleting the default value in the latter key should
restore the launcher to the list of choices.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/C3ZH4TZLN3APOF3PVSEEZM6XIVCIIFVG/


[Python-ideas] Re: Control adding script path/CWD to sys.path

2020-02-24 Thread Eryk Sun
On 2/24/20, jdve...@gmail.com  wrote:
>
> I try to use along with -m (`python -I -m a.b`) and get this error: "python:
> Error while finding module specification for 'a.b' (ModuleNotFoundError: No
> module named 'a')".

This is a use case for -m that requires adding the working directory
to sys.path. I work in virtual environments, and I don't navigate into
a package and execute modules. The target package is always either in
the standard library or installed in site-packages, and the module is
executed from the top level. So for me adding the working directory is
a feature I never need, and I completely forgot about why anyone would
want it.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/RAZRNATFDLGS3MIAIEXXLKQTD447UK3P/


[Python-ideas] Re: Control adding script path/CWD to sys.path

2020-02-24 Thread Eryk Sun
On 2/24/20, jdve...@gmail.com  wrote:
>
> It is the intended and the expected behaviour. The working directory is
> always added to the sys.path.

You mean always in this particular context, i.e. the working directory
is added normally when executing a command via -c or a module as a
script via -m. When executing a script normally, the script directory
gets added, which is reasonably secure.

Adding the working directory to sys.path is ok for the interactive
shell and -c commands, but I don't understand why it gets added with
-m, which is a security hole, and to me an annoyance. It can be
disabled with isolated mode, but that's a blunt instrument that
disables too much.
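To see the difference concretely, here is a small experiment. It writes a throwaway module (the name probe_mod is arbitrary) into a temporary directory and runs it with -m from that directory, first normally and then in isolated mode:

```python
import os
import subprocess
import sys
import tempfile


def run_module(options, cwd):
    """Run 'python [options] -m probe_mod' with cwd as the working directory."""
    # PYTHONSAFEPATH (3.11+) would also suppress the entry, so drop it here.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    return subprocess.run(
        [sys.executable, *options, "-m", "probe_mod"],
        cwd=cwd, env=env, capture_output=True, text=True)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "probe_mod.py"), "w") as f:
            f.write("import sys\nprint(sys.path[0])\n")
        normal = run_module([], tmp)
        isolated = run_module(["-I"], tmp)
        return {
            "cwd": os.path.realpath(tmp),
            "normal_path0": os.path.realpath(normal.stdout.strip()),
            "isolated_returncode": isolated.returncode,
            "isolated_stderr": isolated.stderr,
        }
```

In the normal run, sys.path[0] is the working directory, which is why the module is found at all. With -I the working directory is not added, so the same command fails with "No module named probe_mod".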
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/6SSCBUIPMFJC2ZR67DVTHICN3B5UDX2F/


[Python-ideas] Re: Add logging to subprocess.Popen

2020-02-24 Thread Eryk Sun
On 2/24/20, Guido van Rossum  wrote:
>
> The stdlib does very little logging of its own -- logging is up to
> the application.

It's not logging per se, but the standard library does have an
extensive and growing list of audit events that are intended to assist
with testing, logging and security monitoring.

https://docs.python.org/3/library/audit_events.html
https://www.python.org/dev/peps/pep-0578

An event is generated for subprocess.Popen that includes the
executable, args, cwd, and env parameters. There's no event for the
result, however.
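For anyone who wants to experiment, a minimal hook that records this event might look like the following. It assumes Python 3.8 or later, and the names log_popen and seen are just for illustration. Note that an audit hook cannot be removed once it has been added.

```python
import subprocess
import sys

seen = []


def log_popen(event, args):
    # The "subprocess.Popen" event carries (executable, args, cwd, env).
    if event == "subprocess.Popen":
        executable, popen_args, cwd, env = args[:4]
        seen.append((executable, popen_args, cwd))


sys.addaudithook(log_popen)
subprocess.run([sys.executable, "-c", "pass"], check=True)

for executable, popen_args, cwd in seen:
    print("Popen:", executable, popen_args, "cwd=%r" % (cwd,))
```

An application could forward these records to the logging module. The exit status still has to be logged by the caller, since no event is raised for it.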
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/OZREZJRFMJERP4OMG23BEPVPGPUYBXU7/


[Python-ideas] Re: Recommend UTF-8 mode on Windows

2020-01-14 Thread Eryk Sun
On 1/14/20, Inada Naoki  wrote:
>
> UTF-8 mode shouldn't take precedence over legacy FS encoding.
>
> Mercurial uses legacy encoding for file paths.  They use
> sys._enablelegacywindowsfsencoding() on Windows.
> https://www.mercurial-scm.org/repo/hg/rev/8d5489b048b7

This runtime call can override the initial configuration that's based
on environment variables and -X options.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/FXFBPIIZDFEZR5WVXVOMKMA5KLK3SNGH/


[Python-ideas] Re: Recommend UTF-8 mode on Windows

2020-01-12 Thread Eryk Sun
On 1/10/20, Andrew Barnert via Python-ideas  wrote:
> On Jan 10, 2020, at 03:45, Inada Naoki  wrote:
>
> Also, PYTHONUTF8 is only supported on Unix, so presumably it’s ignored if
> you set it on Windows, right?

The implementation of UTF-8 mode (i.e. -Xutf8) is cross-platform,
though I think it could use some tweaking for Windows.
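A quick way to check this on any platform is to start a child interpreter with -X utf8 and look at what it reports. This is only a probe script. The helper name probe is made up, and it assumes Python 3.7 or later:

```python
import subprocess
import sys

PROBE = (
    "import locale, sys; "
    "print(sys.flags.utf8_mode, "
    "locale.getpreferredencoding(False), "
    "sys.getfilesystemencoding())"
)


def probe(*options):
    """Return (utf8_mode, locale encoding, filesystem encoding) of a child."""
    result = subprocess.run([sys.executable, *options, "-c", PROBE],
                            capture_output=True, text=True, check=True)
    mode, locale_encoding, fs_encoding = result.stdout.split()
    return int(mode), locale_encoding.lower(), fs_encoding.lower()


print(probe("-X", "utf8"))
print(probe())  # default configuration, for comparison
```

With -X utf8 the child should report utf8_mode as 1 and UTF-8 as the locale encoding. The default run shows what the platform would use otherwise.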

>> I believe UTF-8 should be chosen by default for text encoding.
>
> Correct me if I’m wrong, but I think in Python 3.7 on Windows 10, the
> filesystem encoding is already UTF-8, and the stdio console files are UTF-8
> (but under the covers actually wrap the native UTF-16 console APIs instead
> of using msvcrt stdio), so the only issue is the locale encoding, right?

Yes, 3.6+ in Windows defaults to UTF-8 for console I/O and the
filesystem encoding. If for some reason you need the legacy behavior,
it can be enabled via the following environment variables [1]:
PYTHONLEGACYWINDOWSSTDIO and PYTHONLEGACYWINDOWSFSENCODING.

Setting PYTHONLEGACYWINDOWSFSENCODING switches the filesystem encoding
to "mbcs". Note that this does not use the system MBS (multibyte
string) API. Python simply transcodes between UTF-16 and ANSI instead
of UTF-8. Currently this setting takes precedence over UTF-8 mode, but
I think it should be the other way around.

Setting PYTHONLEGACYWINDOWSSTDIO uses the console input codepage for
stdin and the console output codepage for stdout and stderr, but only
if isatty is true and the process is attached to a console (see
_Py_device_encoding in Python/fileutils.c). Otherwise it uses the
system ANSI codepage.

Note that this setting is currently **broken** in 3.8. In
Python/initconfig.c, config_init_stdio_encoding calls
config_get_locale_encoding to set config->stdio_encoding. This always
uses the system ANSI codepage (e.g. 1252), even for console files for
which this choice makes no sense.

Combining UTF-8 mode with legacy Windows standard I/O is generally
dysfunctional. The result is mojibake, unless the console codepage
happens to be UTF-8. I'd prefer UTF-8 mode to take precedence over
legacy standard I/O mode and have it imply non-legacy I/O.

In both of the above cases, what I'd prefer is for UTF-8 mode to take
precedence over legacy modes, i.e. to disable
config->legacy_windows_fs_encoding and config->legacy_windows_stdio in
the startup configuration.

Regarding the MBS API and UTF-8

In Windows 10, it's possible to set the ANSI and OEM codepages to
UTF-8 at both the system level (in the system control panel) and the
application level (in the application manifest). But many functions
are still only available in the WCS (wide-character string) API, such
as GetLocaleInfoEx, GetFileInformationByHandleEx, and
SetFileInformationByHandle. I don't know whether Microsoft plans to
implement MBS wrappers in these cases.

If the ANSI codepage is UTF-8, then the MBS file API (e.g.
CreateFileA) is basically equivalent to Python's UTF-8 filesystem
encoding. There's one exception. Python uses the "surrogatepass" error
handler, which allows invalid surrogate codes (i.e. a "Wobbly" WTF-8
encoding). In contrast, the MBS API translates invalid surrogates to
the replacement character (U+FFFD). I think Python's choice is more
sensible because the WCS file API (e.g. CreateFileW) and filesystem
drivers do not verify that strings are valid Unicode.
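The codec side of this is easy to demonstrate in pure Python. The following only illustrates the error handlers, using a made-up name containing a lone surrogate. It does not call the Windows MBS API, and the "replace" handler merely stands in for the API's U+FFFD substitution:

```python
name = "spam\ud800"  # ends with a lone high surrogate

# Strict UTF-8 refuses to encode a lone surrogate.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# "surrogatepass" encodes it anyway (WTF-8) and round-trips losslessly.
wobbly = name.encode("utf-8", "surrogatepass")
round_trip = wobbly.decode("utf-8", "surrogatepass")

# A decoder that substitutes U+FFFD loses the original name.
replaced = wobbly.decode("utf-8", "replace")

print(strict_ok, wobbly, round_trip == name, replaced != name)
```

The lossless round trip is what lets Python address a file whose name is not valid UTF-16, which a replacement-based API cannot do.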

The console uses the system OEM codepage as its default I/O codepage.
Setting OEM to UTF-8 (at the system level, not at the application
level), or manually setting the codepage to UTF-8 via `chcp.com
65001`, is a potential problem because the console doesn't support
reading non-ASCII UTF-8 strings via ReadFile or ReadConsoleA. Prior to
Windows 10, it returns an empty string for this case, which looks like
EOF. The new console in Windows 10 instead translates each non-ASCII
character to a null byte (e.g. "SPĀM" -> "SP\x00M"), which is better
but still pretty much useless for reading non-English input. Python
3.6+ is for the most part immune to this. In the default
configuration, it uses ReadConsoleW to read UTF-16 instead of relying
on the input codepage. (Low-level os.read is not immune to the
problem, however, because it is not integrated with the new console
I/O implementation.)

[1] https://docs.python.org/3/using/cmdline.html#environment-variables
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/G2NOSM6EFOOO5WCLTCEWJ7DWS57DDZTY/


[Python-ideas] Re: Suggestion: Windows launcher default to not using pre-releases by default

2019-07-10 Thread eryk sun
On 7/10/19, Brendan Barnwell  wrote:
>
>   I agree that it seems the real problem here is the lack of a real way
> to determine if an available version is a real release or a
> prerelease/beta.  Is it not possible to change that, so that it is
> possible for the launcher to quickly and easily determine the highest
> release version available?

In a previous reply, I gave a simple example based on FIELD3 of the
file version (or product version) that's embedded in python[w].exe.
This doesn't require changes to the registry, and doesn't require
running the executable to parse version information from stdout, which
would be relatively slow. It will only work for releases that have the
version info in the executable. I don't recall when we started adding
it, but I know the 2.7 executable doesn't have it.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/J7GPFOZYKEBI5BHYIWZIPVYX2UWBMLA2/


[Python-ideas] Re: Suggestion: Windows launcher default to not using pre-releases by default

2019-07-10 Thread eryk sun
On 7/9/19, Steve Barnes  wrote:
>
> Currently the py[w] command will launch the latest python by default however
> I feel that this discourages the testing of pre-releases & release
> candidates as once they are installed they will become the default. What I
> would like is for the default to be the highest version number of a full
> release but the user to be able to specify a specific version even if it is
> a pre-release.

With the existing launcher, if we install a pre-release candidate, we
can set the PY_PYTHON environment variable to make the launcher
default to a preferred stable release.

To modify the launcher to detect a "final" build, we can check the
file version from the PE image's FIXEDFILEINFO [1]. It consists of
four 16-bit values: PY_MAJOR_VERSION, PY_MINOR_VERSION, FIELD3,
PYTHON_API_VERSION. What we want is FIELD3, which is the upper WORD in
the least significant DWORD (i.e. dwFileVersionLS >> 16). FIELD3 is
computed as

micro * 1000 + levelnum * 10 + serial

where levelnum is

alpha: 10
beta: 11
candidate: 12
final: 15

and serial is 0-9. The executable is a "final" release if FIELD3
modulo 1000 is at least 150. Here's a quick ctypes example with Python
3.7.3:

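import ctypes
import sys

# Load the fixed version info block from the running executable (Windows only).
version = ctypes.WinDLL('version', use_last_error=True)
szBlock = version.GetFileVersionInfoSizeW(sys.executable, None)
block = (ctypes.c_char * szBlock)()
version.GetFileVersionInfoW(sys.executable, 0, szBlock, block)

# Query the root block, which is a VS_FIXEDFILEINFO record. Viewed as an
# array of DWORDs, index 3 is dwFileVersionLS.
pinfo = ctypes.POINTER(ctypes.c_ulong)()
szInfo = ctypes.c_ulong()
version.VerQueryValueW(block, '\\', ctypes.byref(pinfo),
                       ctypes.byref(szInfo))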

>>> (pinfo[3] >> 16) % 1000
150

>>> sys.version_info
sys.version_info(major=3, minor=7, micro=3, releaselevel='final', serial=0)
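The arithmetic is easy to check without touching the Windows API. Here is a small sketch of the encoding and the "final" test described above. The function names field3 and is_final are mine, not the launcher's:

```python
LEVELNUM = {"alpha": 10, "beta": 11, "candidate": 12, "final": 15}


def field3(micro, releaselevel, serial):
    """Compute FIELD3 of the file version as described above."""
    return micro * 1000 + LEVELNUM[releaselevel] * 10 + serial


def is_final(field3_value):
    """A build is final if FIELD3 modulo 1000 is at least 150."""
    return field3_value % 1000 >= 150


print(field3(3, "final", 0), is_final(field3(3, "final", 0)))
print(field3(0, "beta", 2), is_final(field3(0, "beta", 2)))
```

For 3.7.3 final this gives 3150, and 3150 % 1000 is the 150 shown in the ctypes session. A hypothetical 3.x.0 beta 2 would give 112, which falls below the threshold.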

[1]: https://docs.microsoft.com/en-us/windows/win32/api/verrsrc/ns-verrsrc-tagvs_fixedfileinfo
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/IZBCCXXOZNDW6XEZUP3WSGSRRIXVJOVG/


Re: [Python-ideas] shutil.symlink to allow non-race replacement of existing link targets

2019-05-14 Thread eryk sun
On 5/14/19, Steven D'Aprano  wrote:
>
> On posix systems, you should be able to use chattr +i to make the file
> immutable, so that the attacker cannot remove or replace it.

Minor point of clarification. File attributes, and APIs to access
them, are not in the POSIX standard. chattr is a Linux command that
wraps the filesystem IOCTLs for getting and setting file attributes.
There's no chattr system call, so thus far it's not supported in
Python's os module. BSD and macOS have chflags, which supports both
system- and user-immutable file attributes. Python supports it as
os.chflags.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] shutil.symlink to allow non-race replacement of existing link targets

2019-05-14 Thread eryk sun
On 5/14/19, Serge Matveenko  wrote:
>
> My point was that in case of `os.symlink` vs `shutil.symlink` it is
> not obvious how they are different even taking into account their
> namespaces.

I prefer to reserve POSIX system call names if possible, unless it's a
generic name such as "open" or "close".

Note that there's also the possibility of extending pathlib's
`symlink_to` method.


Re: [Python-ideas] Provide additional debug info for OSError and WindowsError

2019-04-12 Thread eryk sun
On 4/12/19, Giampaolo Rodola'  wrote:
>
> As such I was thinking that perhaps it would be nice to provide 2 new
> cPython APIs:
>
> PyErr_SetFromErrnoWithMsg(PyObject *type, const char *msg)
> PyErr_SetFromWindowsErrWithMsg(int ierr, const char *msg)
> PyErr_SetExcFromWindowsErrWithMsg(PyObject *type, int ierr, const char
> *msg)
>
> With this in place also OSError and WindowsError would probably have
> to host a new "extramsg" attribute or something (but not necessarily).

Existing error handling would benefit from this proposal. win32_error
[1], win32_error_object_error, and PyErr_SetFromWindowsErrWithFunction
[2] take a function name that's currently ignored.

[1]: https://github.com/python/cpython/blob/v3.7.3/Modules/posixmodule.c#L1403
[2]: https://github.com/python/cpython/blob/v3.7.3/PC/winreg.c#L26


Re: [Python-ideas] Add subprocess.Popen suspend() and resume()

2019-03-24 Thread eryk sun
On 3/24/19, Giampaolo Rodola'  wrote:
> On Wed, Mar 20, 2019 at 11:19 PM eryk sun  wrote:
>
>> This code repeatedly calls PsGetNextProcessThread to walk the
>> non-terminated threads of the process in creation order (based on a
>> linked list in the process object) and suspends each thread via
>> PsSuspendThread. In contrast, a Tool-Help thread snapshot is
>> unreliable since it won't include threads created after the snapshot
>> is created. The alternative is to use a different undocumented system
>> call, NtGetNextThread [2], which is implemented via
>> PsGetNextProcessThread. But that's slightly worse than calling
>> NtSuspendProcess.
>>
>> [1]: https://stackoverflow.com/a/11010508
>> [2]: https://github.com/processhacker/processhacker/blob/v2.39/
>>  phnt/include/ntpsapi.h#L848
>
> FWIW older psutil versions relied on Thread32Next / OpenThread /
> SuspendThread / ResumeThread, which appear similar to these Ps*
> counterparts (and I assume have the same drawbacks).

This is the toolhelp snapshot I was talking about, which is an
unreliable way to pause a process since it doesn't include threads
created after the snapshot. For TH32CS_SNAPTHREAD, it's based on
calling NtQuerySystemInformation: SystemProcessInformation to take a
snapshot of all running processes and threads at the time. This buffer
gets written to a shared section, and the section handle is returned
as the snapshot handle. Thread32First and Thread32Next are called to
walk the buffer a record at a time by temporarily mapping the section
with NtMapViewOfSection and NtUnmapViewOfSection.

In contrast, NtSuspendProcess is based on PsGetNextProcessThread,
which walks a linked list of the non-terminated threads in the
process. Unlike a snapshot, this won't miss threads created after we
start, since new threads are appended to the list. To implement this
in user mode with SuspendThread would require the NtGetNextThread
system call that's implemented via PsGetNextProcessThread. But that's
just trading one undocumented system call for another at the expense
of a more complicated implementation.


Re: [Python-ideas] Add subprocess.Popen suspend() and resume()

2019-03-20 Thread eryk sun
On 3/18/19, Giampaolo Rodola'  wrote:
>
> I've been having these 2 implemented in psutil for a long time. On
> POSIX these are convenience functions using os.kill() + SIGSTOP /
> SIGCONT (the same as CTRL+Z  / "fg"). On Windows they use
> undocumented NtSuspendProcess and NtResumeProcess Windows
> APIs available since XP.

Currently, Windows Python only calls documented C runtime-library and
Windows API functions. It doesn't directly call NT runtime-library and
system functions. Maybe it could in the case of documented functions,
but calling undocumented functions in the standard library should be
avoided. Unfortunately, without NtSuspendProcess and NtResumeProcess,
I don't see a way to reliably implement this feature for Windows. I'm
CC'ing Steve Dower. He might say it's okay in this case, or know of
another approach.
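For reference, the POSIX half described in the quoted message is straightforward. This is a sketch of the os.kill approach only, and it will not run on Windows. The suspend and resume helpers here are hypothetical module-level functions, not the proposed Popen methods:

```python
import os
import signal
import subprocess
import sys


def suspend(proc):
    os.kill(proc.pid, signal.SIGSTOP)


def resume(proc):
    os.kill(proc.pid, signal.SIGCONT)


def demo():
    """Stop a sleeping child, confirm it stopped, then resume and kill it."""
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        suspend(proc)
        # WUNTRACED makes waitpid report a stopped (not exited) child.
        _, status = os.waitpid(proc.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)
        resume(proc)
    finally:
        proc.kill()
        proc.wait()
    return stopped
```

Note that SIGSTOP is not counted: a single SIGCONT resumes the process no matter how many times it was stopped. That differs from the per-thread suspend count on Windows.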

DebugActiveProcess, the other simple approach mentioned in the linked
SO answer [1], is unreliable and has the wrong semantics.  A process
only has a single debug port, so DebugActiveProcess will fail the PID
as an invalid parameter if another debugger is already attached to the
process. (The underlying NT call, DbgUiDebugActiveProcess, fails with
STATUS_PORT_ALREADY_SET.) Additionally, the semantics that I expect
here, at least for Windows, is that each call to suspend() will
require a corresponding call to resume(), since it's incrementing the
suspend count on the threads; however, a debugger can't reattach to
the same process. Also, if the Python process exits while it's
attached as a debugger, the system will terminate the debuggee as well,
unless we call DebugSetProcessKillOnExit(0), but that interferes with
the Python process acting as a debugger normally, as does this entire
wonky idea. Also, the debugging system creates a thread in the debuggee
that calls NT DbgUiRemoteBreakin, which executes a breakpoint. This
thread is waiting, but it's not suspended, so the process will never
actually appear as suspended in Task Manager or Process Explorer.

That leaves enumerating threads in a snapshot and calling OpenThread
and SuspendThread on each thread that's associated with the process.
In comparison, let's take an abridged look at the guts of
NtSuspendProcess.

nt!NtSuspendProcess:
    ...
    mov     r8,qword ptr [nt!PsProcessType]
    ...
    call    nt!ObpReferenceObjectByHandleWithTag
    ...
    call    nt!PsSuspendProcess
    ...
    mov     ebx,eax
    call    nt!ObfDereferenceObjectWithTag
    mov     eax,ebx
    ...
    ret

nt!PsSuspendProcess:
    ...
    call    nt!ExAcquireRundownProtection
    cmp     al,1
    jne     nt!PsSuspendProcess+0x74
    ...
    call    nt!PsGetNextProcessThread
    xor     ebx,ebx
    jmp     nt!PsSuspendProcess+0x62

nt!PsSuspendProcess+0x4d:
    ...
    call    nt!PsSuspendThread
    ...
    call    nt!PsGetNextProcessThread

nt!PsSuspendProcess+0x62:
    ...
    test    rax,rax
    jne     nt!PsSuspendProcess+0x4d
    ...
    call    nt!ExReleaseRundownProtection
    jmp     nt!PsSuspendProcess+0x79

nt!PsSuspendProcess+0x74:
    mov     ebx,0C000010Ah (STATUS_PROCESS_IS_TERMINATING)

nt!PsSuspendProcess+0x79:
    ...
    mov     eax,ebx
    ...
    ret

This code repeatedly calls PsGetNextProcessThread to walk the
non-terminated threads of the process in creation order (based on a
linked list in the process object) and suspends each thread via
PsSuspendThread. In contrast, a Tool-Help thread snapshot is
unreliable since it won't include threads created after the snapshot
is created. The alternative is to use a different undocumented system
call, NtGetNextThread [2], which is implemented via
PsGetNextProcessThread. But that's slightly worse than calling
NtSuspendProcess.

[1]: https://stackoverflow.com/a/11010508
[2]: https://github.com/processhacker/processhacker/blob/v2.39/phnt/include/ntpsapi.h#L848
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Running Python commands from a Shell

2019-02-01 Thread eryk sun
On 2/1/19, Steven D'Aprano  wrote:
> On Fri, Feb 01, 2019 at 07:21:47PM -0600, eryk sun wrote:
>
>> As soon as "pipe" is mentioned, anyone familiar with the REPL's
>> behavior with pipes should know that making this work will require the
>> -i command-line option to force interactive mode. Otherwise stdout
>> will be fully buffered. For example:
> [...]
>
> I wonder... could Python automatically detect when it is connected to
> pipes and switch buffering off?

In most cases we want full buffering when standard I/O is a pipe or
disk file. It's more efficient to read/write large chunks from/to the
OS.

In another message I saw -u mentioned to disable buffering. But that's
not sufficient. We need -i to force running the built-in REPL over a
pipe, and optionally -q to quiet the initial banner message.
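
As a rough illustration of what -i changes when stdin is a pipe, the
sketch below feeds a bare expression to a child interpreter twice,
once with -q -i and once without. The helper name run_over_pipe is
made up for this example, and it assumes sys.executable is a CPython 3
interpreter whose REPL prompts go to stderr.

```python
import subprocess
import sys

def run_over_pipe(*options):
    """Feed the expression '6 * 7' to a child interpreter over a pipe.

    Returns whatever the child wrote to stdout, stripped.
    """
    proc = subprocess.run(
        [sys.executable, *options],
        input='6 * 7\n',
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,   # discard prompts and any banner
        universal_newlines=True,
        timeout=60,
    )
    return proc.stdout.strip()

# With -i the built-in REPL runs and echoes the value of the expression.
interactive = run_over_pipe('-q', '-i')
# Without -i, stdin is executed like a script, so nothing is echoed.
scripted = run_over_pipe()
print(repr(interactive), repr(scripted))
```

Only the -i run behaves like a REPL; -u alone would change buffering
but would still execute stdin as a script.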


Re: [Python-ideas] Option of running shell/console commands inside the REPL

2019-02-01 Thread eryk sun
On 2/1/19, Terry Reedy  wrote:
> On 2/1/2019 3:31 PM, Oleg Broytman wrote:
>
>> Python REPL is missing the following batteries:
>> * Persistent history;

Python's built-in REPL relies on the readline module for history. In
Windows you'll need to install pyreadline, an implementation that uses
the Windows console API via ctypes.

Out of the box, Python uses the built-in line editing and history
that's provided by the Windows console host (conhost.exe). There's an
undocumented function to read this history (as used by doskey.exe),
but there's no function to load lines into it. I suppose it could be
replayed manually in a loop that calls WriteConsoleInputW and
ReadConsoleW.

> * Windows Console holds a maximum of 9999 characters,

9999 lines, not characters.


Re: [Python-ideas] Running Python commands from a Shell

2019-02-01 Thread eryk sun
On 2/1/19, Steven D'Aprano  wrote:
> On Fri, Feb 01, 2019 at 04:28:25PM -0600, Dan Sommers wrote:
>
>> As I indicated in what you quoted, shell co-processes allow you to run a
>> command in the background and interact with that command from your
>> shell.
>
> Okay, but what does that mean in practice? What does it require to make
> it work with Python? What is your expected input and output?

bash coproc runs a process in the background with stdin and stdout
redirected to pipes. The file descriptors for our end of the pipes are
available in an array with the given name (e.g. P3). The default array
name is COPROC.

As soon as "pipe" is mentioned, anyone familiar with the REPL's
behavior with pipes should know that making this work will require the
-i command-line option to force interactive mode. Otherwise stdout
will be fully buffered. For example:

$ coproc P3 { python3 -qi 2>&1; }
[1] 16923
$ echo 'import sys; print(sys.version)' >&${P3[1]}

$ read -t 1 <&${P3[0]} && echo $REPLY
>>> 3.6.7 (default, Oct 22 2018, 11:32:17)

$ read -t 1 <&${P3[0]} && echo $REPLY
[GCC 8.2.0]

$ read -t 1 <&${P3[0]} && echo $REPLY
$ echo 'sys.exit(42)' >&${P3[1]}
$
[1]+  Exit 42 coproc P3 { python3 -qi 2>&1; }

> And are we supposed to know what ">&${P3[1]}" does? It looks like your
> cat walked over your keyboard.

It redirects the command's standard output (>) to the file descriptor
(&) in index 1 of the P3 array (${P3[1]}), which is our end of the
pipe that's connected to stdin of the co-process.


Re: [Python-ideas] struct.unpack should support open files

2018-12-25 Thread eryk sun
On 12/25/18, Steven D'Aprano  wrote:
> On Tue, Dec 25, 2018 at 04:51:18PM -0600, eryk sun wrote:
>>
>> Alternatively, we can memory-map the file via mmap. An important
>> difference is that the mmap buffer interface is low-level (e.g. no
>> file pointer and the offset has to be page aligned), so we have to
>> slice out bytes for the given offset and size. We can avoid copying
>> via memoryview slices.
>
> Seems awfully complicated. How do we do all these things, and what
> advantage does it give?

Refer to the mmap and memoryview docs. It is more complex, though not
significantly so, and it's not something I'd suggest to a novice. Anyway,
another disadvantage is that this requires a real OS file, not just a
file-like interface. One possible advantage is that we can work
naively and rely on the OS to move pages of the file to and from
memory on demand. However, making this really convenient requires the
ability to access memory directly with on-demand conversion, as is
possible with ctypes (records & arrays) or numpy (arrays).

Out of the box, multiprocessing works like this for shared-memory
access. For example:

import ctypes
import multiprocessing

class Record(ctypes.LittleEndianStructure):
    _pack_ = 1
    _fields_ = (('a', ctypes.c_int),
                ('b', ctypes.c_char * 4))

a = multiprocessing.Array(Record, 2)
a[0].a = 1
a[0].b = b'spam'
a[1].a = 2
a[1].b = b'eggs'

>>> a._obj


Shared values and arrays are accessed out of a heap that uses arenas
backed by mmap instances:

>>> a._obj._wrapper._state
((, 0, 16), 16)
>>> a._obj._wrapper._state[0][0].buffer


The two records are stored in this shared memory:

>>> a._obj._wrapper._state[0][0].buffer[:16]
b'\x01\x00\x00\x00spam\x02\x00\x00\x00eggs'

>> We can also use ctypes instead of
>> memoryview/struct.
>
> Only if you want non-portable code.

ctypes has good support for at least Linux and Windows, but it's an
optional package in CPython's standard library and not necessarily
available with other implementations.

> What advantage over struct is ctypes?

If it's available, I find that ctypes is often more convenient than
the manual pack/unpack approach of struct. If we're writing to the
file, ctypes lets us directly assign data to arrays and the fields of
records on disk (the ctypes instance knows the address and its data
descriptors handle converting values implicitly). The tradeoff is that
defining structures in ctypes can be tedious (_pack_, _fields_)
compared to the simple format strings of the struct module. With
ctypes it helps to already be fluent in C.
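
To make the comparison concrete, here is a minimal sketch that updates
the same 8-byte record both ways. A bytearray stands in for the file
contents (with a real file it could be a writable mmap), and the
Record layout and FMT string are assumptions chosen for illustration.

```python
import ctypes
import struct

class Record(ctypes.LittleEndianStructure):
    # A 32-bit int followed by 4 chars needs no padding, so _pack_ is
    # not required for this particular layout.
    _fields_ = (('a', ctypes.c_int32),
                ('b', ctypes.c_char * 4))

FMT = '<i4s'  # the equivalent struct format string

# A writable buffer standing in for the file contents.
buf = bytearray(struct.pack(FMT, 1, b'spam'))

# struct: unpack to a tuple, change a value, pack it back explicitly.
a, b = struct.unpack_from(FMT, buf)
struct.pack_into(FMT, buf, 0, a + 1, b)

# ctypes: overlay the record on the buffer and assign to fields directly.
rec = Record.from_buffer(buf)
rec.a += 1
rec.b = b'eggs'

result = struct.unpack_from(FMT, buf)
print(result)
```

Both updates land in the same bytes; the ctypes version just hides the
pack/unpack step behind attribute access.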


Re: [Python-ideas] struct.unpack should support open files

2018-12-25 Thread eryk sun
On 12/24/18, Drew Warwick  wrote:
> The struct unpack API is inconvenient to use with files. I must do:
>
> struct.unpack(fmt, file.read(struct.calcsize(fmt)))

Alternatively, we can memory-map the file via mmap. An important
difference is that the mmap buffer interface is low-level (e.g. no
file pointer and the offset has to be page aligned), so we have to
slice out bytes for the given offset and size. We can avoid copying
via memoryview slices. We can also use ctypes instead of
memoryview/struct.
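
A minimal sketch of the mmap approach, assuming fixed-size records in
a made-up '<i4s' format. The read_record helper and the temporary file
are hypothetical and exist only to keep the example self-contained.

```python
import mmap
import os
import struct
import tempfile

FMT = '<i4s'
SIZE = struct.calcsize(FMT)

def read_record(path, index):
    """Unpack record number `index` from `path` through a memory map."""
    with open(path, 'rb') as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            with memoryview(mm) as view:
                start = index * SIZE
                # Slicing a memoryview does not copy the underlying bytes.
                with view[start:start + SIZE] as chunk:
                    return struct.unpack(FMT, chunk)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'records.bin')
    with open(path, 'wb') as f:
        f.write(struct.pack(FMT, 1, b'spam'))
        f.write(struct.pack(FMT, 2, b'eggs'))
    records = [read_record(path, i) for i in range(2)]

print(records)
```

The memoryview objects are released explicitly (via the with blocks)
because an mmap can't be closed while exported buffers still exist.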


Re: [Python-ideas] New PEP proposal -- Pathlib Module Should Contain All File Operations -- version 2

2018-03-25 Thread eryk sun
On Sat, Mar 17, 2018 at 10:42 AM, George Fischhof  wrote:
>
> All functions from os module accept path-like objects,
> and none of the shutil functions.

shutil indirectly supports __fspath__ paths via os and os.path. One
exception is shutil.disk_usage() on Windows, which only supports str
strings. This is fixed in 3.7, in resolution of issue 32556. Maybe it
should be backported to 3.6.

I like the idea of a high-level library that provides a subset of
commonly used os, io, and shutil functionality in one place. But maybe
a new module isn't required. shutil could be extended since its design
goal is to provide "high-level operations on files and collections of
files".

That said, pathlib's goal to support "concrete paths [that] provide
I/O operations" does seem incomplete. It should support copy,
copytree, rmtree, and move methods. Also, a `parents` option should be
added to Path.rmdir to implement removedirs, which mirrors how
Path.mkdir implements makedirs.
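
Since Path.rmdir has no such option, the behavior can only be sketched
with a helper. The name rmdir_parents is hypothetical; it is meant to
mimic os.removedirs by pruning empty ancestors until one can't be
removed.

```python
import tempfile
from pathlib import Path

def rmdir_parents(path):
    """Remove an empty directory, then prune its empty ancestors."""
    path = Path(path)
    path.rmdir()              # errors for the leaf directory propagate
    for parent in path.parents:
        try:
            parent.rmdir()
        except OSError:
            break             # first non-empty or protected ancestor

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / 'keep.txt').write_text('x')
    leaf = base / 'a' / 'b' / 'c'
    leaf.mkdir(parents=True)
    rmdir_parents(leaf)
    remaining = sorted(p.name for p in base.iterdir())

print(remaining)
```

Here 'a', 'a/b' and 'a/b/c' are removed, while the base directory
survives because it still contains keep.txt.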

> os.link => path.hardlink_to

I'm surprised this doesn't already exist.


Re: [Python-ideas] Descouraging the implicit string concatenation

2018-03-14 Thread eryk sun
On Wed, Mar 14, 2018 at 12:18 PM, Facundo Batista
 wrote:
>
> Note that there's no penalty in adding the '+' between the strings,
> those are resolved at compilation time.

The above statement is not true for versions prior to 3.7. Previously
the addition of string literals was optimized by the peephole
optimizer, with a limit of 20 characters. Do you mean to formally
discourage implicit string-literal concatenation only for 3.7+?
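
One way to check what the compiler does with a given interpreter is to
look at the constants of the compiled code. This is only a sketch: the
string_constants helper is made up, and whether the unfolded pieces
also remain in co_consts varies between versions.

```python
def string_constants(source):
    """Return the str constants stored in the code compiled from `source`."""
    code = compile(source, '<example>', 'exec')
    return {c for c in code.co_consts if isinstance(c, str)}

implicit = string_constants("x = 'spam' 'eggs'")
explicit = string_constants("x = 'spam' + 'eggs'")
print(implicit, explicit)
```

If 'spameggs' shows up for the explicit form, the addition was folded
at compile time. Trying a much longer pair of literals shows whether
the running version applies a length limit.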


Re: [Python-ideas] Memory limits [was Re: Membership of infinite iterators]

2017-10-20 Thread eryk sun
On Thu, Oct 19, 2017 at 9:05 AM, Stephan Houben  wrote:
>
> I (quickly) tried to get something to work using the win32 package,
> in particular the win32job functions.
> However, it seems setting
> "ProcessMemoryLimit" using win32job.SetInformationJobObject
> had no effect
> (i.e.  a subsequent win32job.QueryInformationJobObject
> still showed the limit as 0)?

Probably you didn't set the JOB_OBJECT_LIMIT_PROCESS_MEMORY flag.
Here's an example that tests the process memory limit using ctypes to
call VirtualAlloc, before and after assigning the current process to
the Job.

Note that the py.exe launcher runs python.exe in an anonymous Job
that's configured to kill on close (i.e. python.exe is killed when
py.exe exits) and for silent breakaway of child processes. In this
case, prior to Windows 8 (the first version to support nested Job
objects), assigning the current process to a new Job will fail, so
you'll have to run python.exe directly, or use a child process via
subprocess. I prefer the former, since a child process won't be
tethered to the launcher, which could get ugly for console
applications.

import ctypes
import winerror, win32api, win32job

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

MEM_COMMIT = 0x1000
MEM_RELEASE = 0x8000
PAGE_READWRITE = 4

kernel32.VirtualAlloc.restype = ctypes.c_void_p
kernel32.VirtualAlloc.argtypes = (ctypes.c_void_p, ctypes.c_size_t,
    ctypes.c_ulong, ctypes.c_ulong)
kernel32.VirtualFree.argtypes = (ctypes.c_void_p, ctypes.c_size_t,
    ctypes.c_ulong)

hjob = win32job.CreateJobObject(None, "")

limits = win32job.QueryInformationJobObject(hjob,
    win32job.JobObjectExtendedLimitInformation)
limits['BasicLimitInformation']['LimitFlags'] |= (
    win32job.JOB_OBJECT_LIMIT_PROCESS_MEMORY)
limits['ProcessMemoryLimit'] = 2**31
win32job.SetInformationJobObject(hjob,
    win32job.JobObjectExtendedLimitInformation, limits)

addr0 = kernel32.VirtualAlloc(None, 2**31, MEM_COMMIT,
    PAGE_READWRITE)
if addr0:
    mem0_released = kernel32.VirtualFree(addr0, 0, MEM_RELEASE)

win32job.AssignProcessToJobObject(hjob,
    win32api.GetCurrentProcess())

addr1 = kernel32.VirtualAlloc(None, 2**31, MEM_COMMIT,
    PAGE_READWRITE)

Result:

>>> addr0
2508252315648
>>> mem0_released
1
>>> addr1 is None
True
>>> ctypes.get_last_error() == winerror.ERROR_COMMITMENT_LIMIT
True


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-27 Thread eryk sun
On Mon, Mar 27, 2017 at 8:52 PM, Barry  wrote:
> I took to using
>
>  chcp 65001
>
> This puts cmd.exe into unicode mode.

conhost.exe hosts the console, and chcp.com is a console app that
calls GetConsoleCP, SetConsoleCP and SetConsoleOutputCP to show or
modify the console's input and output codepages. It doesn't support
changing them separately.

cmd.exe is just another console client, no different from python.exe
or powershell.exe in this regard. Also, it's unrelated to how Python
uses the console, but for the record, cmd has used the console's
wide-character API since it was ported from OS/2 in the early 90s.

Back then the console was hosted using threads in the csrss.exe system
process, which made sense because the windowing system was hosted
there. When they moved most of the window manager to kernel mode in NT
4 (1996), the console was mostly left behind in csrss.exe. It wasn't
until Windows 7 that it found a new home in conhost.exe. In Windows 8
it got a real device driver instead of using fake file handles. In
Windows 10 it was updated to be less of a franken-window -- e.g. now
it has line-wrapped selection and text reflowing.

Using codepage 65001 (UTF-8) in a console app has a couple of annoying
bugs in the console itself, and another due to flushing of C FILE
streams. For example, reading text that has even a single non-ASCII
character will fail because conhost's encoding buffer is too small. It
handles the error by returning a read of 0 bytes. That's EOF, so
Python's REPL quits; input() raises EOFError; and stdin.read() returns
an empty string. Microsoft should fix this in Windows 10, and probably
will eventually. The Linux subsystem needs UTF-8, and it's silly that
the console doesn't allow entering non-ASCII text in Linux programs.

As was already recommended, I suggest using the wide-character API via
win_unicode_console in 2.7 and 3.5. In 3.6 we get the wide-character
API automatically thanks to Steve Dower's io._WindowsConsoleIO class.


Re: [Python-ideas] Using Python for end user applications

2017-02-07 Thread eryk sun
On Tue, Feb 7, 2017 at 3:27 PM, Paul Moore  wrote:
> On 7 February 2017 at 14:29, Steve Dower  wrote:
>> You can leave python.exe out of your distribution to avoid it showing up on
>> PATH, or if your stub explicitly LoadLibrary's vcruntime140.dll and then
>> python36.dll you should be able to put them wherever you like.
>
> Understood, but I may need python.exe present if the script uses
> multiprocessing, so I'm trying to avoid doing that (while I'm doing
> things manually, I can do what I like, obviously, but a generic "build
> my app" tool has to be a bit more cautious).
>
> LoadLibrary might work (I'm only calling Py_Main). I seem to recall
> trying this before and having issues but that might have been an
> earlier iteration which made more complex use of the C API. Also, I
> want to load python3.dll (the stable ABI) as I don't want to have to
> rebuild the stub once for each Python version, or have to search for
> the correct DLL in C. But I'll definitely give that a go.

LoadLibrary and GetProcAddress will work, but that would get tedious
if a program needed a lot of Python's API. It's also a bit of a kludge
having to manually call LoadLibrary with a given DLL order.

For the latter, I wish we could simply load python3.dll using
LoadLibraryEx with LOAD_WITH_ALTERED_SEARCH_PATH, but it doesn't work
in good old Windows 7. python3.dll doesn't depend on python3x.dll in
its DLL import table. I discovered in issue 29399 that in this case
the loader in Windows 7 doesn't use the altered search path of
python3.dll to load python3x.dll and vcruntime140.dll.

As you're currently doing (as we discussed last September), creating
an assembly in a subdirectory works in all supported Windows versions,
and it's the most convenient way to access all of Python's limited
API.


Re: [Python-ideas] Is it Python 3 yet?

2017-01-26 Thread eryk sun
On Thu, Jan 26, 2017 at 10:49 PM, Paul Moore  wrote:
> On 26 January 2017 at 22:32, M.-A. Lemburg  wrote:
>> On 26.01.2017 23:09, Random832 wrote:
>>> On Thu, Jan 26, 2017, at 11:21, Paul Moore wrote:
>>>> On a similar note, I always get caught out by the fact that the
>>>> Windows default download is the 32-bit version. Are we not yet at a
>>>> point where a sufficient majority of users have 64-bit machines, and
>>>> 32-bit should be seen as a "specialist" choice?
>>>
>>> I'm actually surprised it doesn't detect it, especially since it does
>>> detect Windows.
>>>
>>> (I bet fewer people have supported 32-bit windows versions than have
>>> Windows XP.)
>>
>> I think you have to differentiate a bit more between having a
>> 64-bit OS and running 64-bit applications.
>>
>> Many applications on Windows are still 32-bit applications and
>> unless you process large amounts of data, a 32-bit Python
>> system is well worth using. In some cases, it's even needed,
>> e.g. if you have to use an extension which links to a 32-bit
>> library.
>
> I agree that there are use cases for a 32-bit Python. But for the
> *average* user, I'd argue in favour of a 64-bit build as the default
> download.

Preferring the 64-bit version would be a friendlier experience for
novices in general nowadays. I've had to explain WOW64 file-system
redirection [1] and registry redirection [2] too many times to people
who are using 32-bit Python on 64-bit Windows. I've seen people waste
over a day on this silly problem. They can't imagine that Windows is
basically lying to them.

[1]: https://msdn.microsoft.com/en-us/library/aa384187
[2]: https://msdn.microsoft.com/en-us/library/aa384232
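
When helping someone with this, a quick first step is to confirm
which build they are actually running. A small sketch using only the
standard library (the variable names are arbitrary):

```python
import platform
import struct
import sys

# Pointer size of the running interpreter, in bits (32 or 64).
interpreter_bits = struct.calcsize('P') * 8
# sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
is_64bit = sys.maxsize > 2**32

print('%d-bit Python on a %s machine'
      % (interpreter_bits, platform.machine()))
```

A 32-bit result on a machine reported as 64-bit is the situation in
which WOW64 redirection applies.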


Re: [Python-ideas] Suggestion: Clear screen command for the REPL

2016-10-04 Thread eryk sun
On Tue, Oct 4, 2016 at 2:22 PM, Random832  wrote:
> On Wed, Sep 28, 2016, at 23:36, Chris Angelico wrote:
>> On Thu, Sep 29, 2016 at 12:04 PM, Steven D'Aprano 
>> wrote:
>> > (Also, it seems a shame that Ctrl-D is EOF in Linux and Mac, but Windows
>> > is Ctrl-Z + Return. Can that be standardized to Ctrl-D everywhere?)
>>
>> Sadly, I suspect not. If you're running in the default Windows
>> terminal emulator (the one a normal user will get by invoking
>> cmd.exe), you're running under a lot of restrictions, and I believe
>> one of them is that you can't get Ctrl-D without an enter.
>
> Well, we could read _everything_ in character-at-a-time mode, and
> implement our own line editing. In effect, that's what readline is
> doing.

3.6+ switched to calling ReadConsoleW, which allows using a 32-bit
control mask to indicate which ASCII control codes should terminate a
read. The control character is left in the input string, so it's
possible to define custom behavior for multiple control characters.
Here's a basic ctypes example of how this feature works. In each case,
after calling ReadConsoleW I enter "spam" and then type a control
character to terminate the read.

import sys
import msvcrt
import ctypes

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
ReadConsoleW = kernel32.ReadConsoleW

CTRL_MASK = 2 ** 32 - 1 # all ctrl codes

hin = msvcrt.get_osfhandle(sys.stdin.fileno())
buf = (ctypes.c_wchar * 10)(*('-' * 10))
pn = (ctypes.c_ulong * 1)()
ctl = (ctypes.c_ulong * 4)(16, 0, CTRL_MASK, 0)

>>> # Ctrl+2 or Ctrl+@ (i.e. NUL)
... ret = ReadConsoleW(hin, buf, 10, pn, ctl); print()
spam
>>> buf[:]
'spam\x00-----'

>>> # Ctrl+D
... ret = ReadConsoleW(hin, buf, 10, pn, ctl); print()
spam
>>> buf[:]
'spam\x04-----'

>>> # Ctrl+[
... ret = ReadConsoleW(hin, buf, 10, pn, ctl); print()
spam
>>> buf[:]
'spam\x1b-----'

This could be used to implement Ctrl+D and Ctrl+L support in
PyOS_Readline. Supporting Ctrl+L to work like GNU readline wouldn't be
a trivial one-liner, but it's doable. It has to clear the screen and
also write the input (except the Ctrl+L) back to the input buffer.
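
The mask has one bit per ASCII control code, so a narrower mask than
"all ctrl codes" is easy to build. The following is a platform-neutral
sketch of the bookkeeping only; the helper names are invented here and
none of this calls the console API.

```python
def ctrl_code(key):
    """Map a key such as 'D', 'L', '@' or '[' to its ASCII control code."""
    return ord(key.upper()) & 0x1F

def wakeup_mask(*keys):
    """Build a 32-bit mask with one bit set per control code."""
    mask = 0
    for key in keys:
        mask |= 1 << ctrl_code(key)
    return mask

def split_terminator(text, mask):
    """Split a completed read into (line, control char), or (text, '')."""
    for i, ch in enumerate(text):
        if ord(ch) < 32 and mask & (1 << ord(ch)):
            return text[:i], ch
    return text, ''

mask = wakeup_mask('D', 'L')      # Ctrl+D is bit 4, Ctrl+L is bit 12
line, terminator = split_terminator('spam\x04-----', mask)
print(hex(mask), repr(line), repr(terminator))
```

A reader loop would pass such a mask as the wakeup mask and then
dispatch on the returned terminator, e.g. treating '\x04' as EOF and
'\x0c' as a request to clear the screen.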

> The main consequence of reading everything in character-at-a-time mode
> is that we'd have to implement everything ourselves, and the line
> editing you get *without* doing it yourself is somewhat nicer on Windows
> than on Linux (it supports cursor movement, inserting characters, and
> history).

Line-input mode also supports F7 for a history popup window to select
a previous command; Ctrl+F to search the screen text; text selection
(e.g. shift+arrows or Ctrl+A); copy/paste via Ctrl+C and Ctrl+V (or
Ctrl+Insert and Shift+Insert); and parameterized input aliases ($1-$9
and $* for parameters).

https://technet.microsoft.com/en-us/library/mt427362
https://technet.microsoft.com/en-us/library/cc753867

>> "Bash on Ubuntu on windows" responds to CTRL+D just fine. I don't really
>> know how it works, but it looks like it is based on the Windows terminal
>> emulator.
>
> It runs inside it, but it's using the "Windows Subsystem for Linux",
> which (I assume) reads character-at-a-time and feeds it to a Unix-like
> terminal driver, (which Bash then has incidentally also put in
> character-at-a-time mode by using readline - to see what you get on WSL
> *without* doing this, try running "cat" under bash.exe)

Let's take a look at how WSL modifies the console's global state.
Here's a simple function to print the console's input and output modes
and codepages, which we can call in the background to monitor the
console state:

import subprocess
import threading

def report():
    hin = msvcrt.get_osfhandle(0)
    hout = msvcrt.get_osfhandle(1)
    modeIn = (ctypes.c_ulong * 1)()
    modeOut = (ctypes.c_ulong * 1)()
    kernel32.GetConsoleMode(hin, modeIn)
    kernel32.GetConsoleMode(hout, modeOut)
    cpIn = kernel32.GetConsoleCP()
    cpOut = kernel32.GetConsoleOutputCP()
    print('\nmodeIn=%x, modeOut=%x, cpIn=%d, cpOut=%d' %
          (modeIn[0], modeOut[0], cpIn, cpOut))

def monitor():
    # Report now, then again every 10 seconds on a timer thread.
    report()
    t = threading.Timer(10, monitor, ())
    t.start()

>>> monitor(); subprocess.call('bash.exe')

modeIn=f7, modeOut=3, cpIn=437, cpOut=437
...
modeIn=2d8, modeOut=f, cpIn=65001, cpOut=65001

See the following page for a description of the mode flags:

https://msdn.microsoft.com/en-us/library/ms686033

The output mode changed from 0x3 to 0xf, enabling

DISABLE_NEWLINE_AUTO_RETURN (0x8)
ENABLE_VIRTUAL_TERMINAL_PROCESSING (0x4)

The input mode changed from 0xf7 to 0x2d8, enabling

ENABLE_VIRTUAL_TERMINAL_INPUT (0x200)
ENABLE_WINDOW_INPUT (0x8, probably for SIGWINCH)

and disabling

ENABLE_INSERT_MODE (0x20)
ENABLE_ECHO_INPUT (0x4)
ENABLE_LINE_INPUT (0x2)
ENABLE_PROCESSED_INPUT (0x1)
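
The flag arithmetic above can be double-checked mechanically. This
sketch decodes the two input-mode values with an IntFlag; the class
and function names are invented for the example, and the flag values
are the documented console input mode constants.

```python
import enum

class ConsoleInputMode(enum.IntFlag):
    ENABLE_PROCESSED_INPUT = 0x1
    ENABLE_LINE_INPUT = 0x2
    ENABLE_ECHO_INPUT = 0x4
    ENABLE_WINDOW_INPUT = 0x8
    ENABLE_MOUSE_INPUT = 0x10
    ENABLE_INSERT_MODE = 0x20
    ENABLE_QUICK_EDIT_MODE = 0x40
    ENABLE_EXTENDED_FLAGS = 0x80
    ENABLE_AUTO_POSITION = 0x100
    ENABLE_VIRTUAL_TERMINAL_INPUT = 0x200

def mode_changes(old, new):
    """Return (enabled, disabled) flags when going from `old` to `new`."""
    enabled = ConsoleInputMode(new & ~old)
    disabled = ConsoleInputMode(old & ~new)
    return enabled, disabled

def names(flags):
    """List the names of the individual flags set in `flags`."""
    return [f.name for f in ConsoleInputMode if f in flags]

enabled, disabled = mode_changes(0xF7, 0x2D8)
print('enabled: ', names(enabled))
print('disabled:', names(disabled))
```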

So you're correct that it's basically using a 

Re: [Python-ideas] Suggestion: Clear screen command for the REPL

2016-09-29 Thread eryk sun
On Thu, Sep 29, 2016 at 7:08 AM, Stephan Houben  wrote:
>
> I just tried with this official Python binary:
> Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC v.1900 32 bit
> (Intel)] on win32
>
> and CTRL-L for sure does clear the window. It just doesn't then move the
> prompt to the top, so you end up with a bunch of empty lines, followed by
> the prompt.

You probably have pyreadline installed. It calls ReadConsoleInputW to
read low-level input records, bypassing the console's normal cooked
read. See the following file that defines the key binding:

https://github.com/pyreadline/pyreadline/blob/1.7/pyreadline/configuration/pyreadlineconfig.ini#L18

Unfortunately pyreadline is broken for non-ASCII input. It ignores the
Alt+Numpad record sequences used for non-ASCII characters.

Without having to implement readline module for Windows (personally, I
don't use it), support for Ctrl+L can be added relatively easily in
3.6+. ReadConsoleW takes a parameter to specify a mask of ASCII
control characters that terminate a read. The control character is
left in the buffer, so code just has to be written that looks for
various control characters to implement features such as a Ctrl+L
clear screen.


Re: [Python-ideas] Suggestion: Clear screen command for the REPL

2016-09-19 Thread eryk sun
On Mon, Sep 19, 2016 at 1:12 PM, Paul Moore  wrote:
> By the way - if you're on a system with readline support included with
> Python, GNU readline apparently has a binding for clear-screen
> (CTRL-L) so you may well have this functionality already (I don't use
> Unix or readline, so I can't comment for sure).

Hooking Ctrl+L to clear the screen can be implemented for Windows
Vista and later via the ReadConsole pInputControl parameter, as called
by PyOS_StdioReadline. It should be possible to match how GNU readline
works -- i.e. clear the screen, reprint the prompt, flush the input
buffer, and write the current line's input back to the input buffer.

The pInputControl parameter can also be used to implement Unix-style
Ctrl+D to end a read anywhere on a line, whereas the classic
[Ctrl+Z][Enter] has to be entered at the start of a line.


Re: [Python-ideas] Suggestion: Clear screen command for the REPL

2016-09-17 Thread eryk sun
On Sat, Sep 17, 2016 at 1:15 PM, Wes Turner  wrote:
>   !cls #windows

cmd's built-in cls command doesn't clear just the screen, like a VT100
\x1b[1J. It clears the console's entire scrollback buffer. Unix
`clear` may also work like that. With GNOME Terminal in Linux, `clear`
leaves a single screen in the scrollback buffer.


Re: [Python-ideas] Adding optional parameter to shutil.rmtree to not delete root.

2016-08-24 Thread eryk sun
On Thu, Aug 25, 2016 at 2:29 AM, Nick Jacobson via Python-ideas
 wrote:
> I've been finding that a common scenario is where I want to remove
> everything in a directory, but leave the (empty) root directory behind, not
> removing it.
>
> So for example, if I have a directory C:\foo and it contains subdirectory
> C:\foo\bar and file C:\foo\myfile.txt, and I want to remove the subdirectory
> (and everything in it) and file, leaving only C:\foo behind.
>
> (This is useful e.g. when the root directory has special permissions, so it
> wouldn't be so simple to remove it and recreate it again.)

Here's a Windows workaround that clears the delete disposition after
rmtree 'deletes' the directory. A Windows file or directory absolutely
cannot be unlinked while there are handle or kernel references to it,
and a handle with DELETE access can set and unset the delete
disposition. This used to require the system call
NtSetInformationFile, but Vista added SetFileInformationByHandle to
the Windows API.

import contextlib
import ctypes
import _winapi

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
kernel32.SetFileInformationByHandle # Vista minimum (NT 6.0+)

DELETE = 0x00010000
SHARE_ALL = 7
OPEN_EXISTING = 3
BACKUP = 0x02000000  # FILE_FLAG_BACKUP_SEMANTICS, needed to open a directory
FileDispositionInfo = 4

@contextlib.contextmanager
def protect_file(path):
    hFile = _winapi.CreateFile(path, DELETE, SHARE_ALL, 0,
                               OPEN_EXISTING, BACKUP, 0)
    try:
        yield
        # Clear the delete disposition that was set inside the with block.
        if not kernel32.SetFileInformationByHandle(
                hFile, FileDispositionInfo,
                (ctypes.c_ulong * 1)(0), 4):
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        kernel32.CloseHandle(hFile)

For example:

>>> os.listdir('test')
['dir1', 'dir2', 'file']
>>> with protect_file('test'):
...     shutil.rmtree('test')
...
>>> os.listdir('test')
[]

Another example:

>>> open('file', 'w').close()
>>> with protect_file('file'):
...     os.remove('file')
...
>>> os.path.exists('file')
True
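
For comparison, a portable way to get the same end result is to delete
the children one by one rather than protecting the root. This is a
different technique from the handle-based workaround above, shown only
as a sketch: the clear_directory name is made up, and it makes no
attempt to handle read-only files or Windows junctions.

```python
import os
import shutil
import tempfile

def clear_directory(root):
    """Delete everything inside `root` but leave `root` itself in place."""
    with os.scandir(root) as it:
        entries = list(it)            # snapshot before deleting anything
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)
        else:
            os.remove(entry.path)     # regular files and symlinks

with tempfile.TemporaryDirectory() as tmp:
    foo = os.path.join(tmp, 'foo')
    os.makedirs(os.path.join(foo, 'bar'))
    for name in ('myfile.txt', os.path.join('bar', 'nested.txt')):
        with open(os.path.join(foo, name), 'w') as f:
            f.write('data')
    clear_directory(foo)
    root_survived = os.path.isdir(foo)
    leftover = os.listdir(foo)

print(root_survived, leftover)
```

Because the root is never deleted, its permissions and any open
handles to it are left untouched.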


Re: [Python-ideas] discontinue iterable strings

2016-08-21 Thread eryk sun
On Sun, Aug 21, 2016 at 6:34 AM, Michael Selik  wrote:
> The detection of not hashable via __hash__ set to None was necessary, but
> not desirable. Better to have never defined the method/attribute in the
> first place. Since __iter__ isn't present on ``object``, we're free to use
> the better technique of not defining __iter__ rather than defining it as
> None, NotImplemented, etc. This is superior, because we don't want __iter__
> to show up in a dir(), help(), or other tools.

The point is to be able to define __getitem__ without falling back on
the sequence iterator.

I wasn't aware of the recent commit that allows anti-registration of
__iter__. This is perfect:

>>> class C:
...     __iter__ = None
...     def __getitem__(self, index): return 42
...
>>> iter(C())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'C' object is not iterable
>>> isinstance(C(), collections.abc.Iterable)
False


Re: [Python-ideas] discontinue iterable strings

2016-08-21 Thread eryk sun
On Sun, Aug 21, 2016 at 5:27 AM, Chris Angelico  wrote:
> Hmm. It would somehow need to be recognized as "not iterable". I'm not
> sure how this detection is done; is it based on the presence/absence
> of __iter__, or is it by calling that method and seeing what comes
> back? If the latter, then sure, an __iter__ that raises would cover
> that.

PyObject_GetIter calls __iter__ (i.e. tp_iter) if it's defined. To get
a TypeError, __iter__ can return an object that's not an iterator,
i.e. an object that doesn't have a __next__ method (i.e. tp_iternext).
For example:

>>> class C:
...     def __iter__(self): return self
...     def __getitem__(self, index): return 42
...
>>> iter(C())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: iter() returned non-iterator of type 'C'

If __iter__ isn't defined but __getitem__ is defined, then
PySeqIter_New is called to get a sequence iterator.

>>> class D:
...     def __getitem__(self, index): return 42
...
>>> it = iter(D())
>>> type(it)
<class 'iterator'>
>>> next(it)
42
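
The lookup order can be modeled in pure Python. This is a simplified
sketch, not the C implementation: get_iter and sequence_iterator are
illustrative names, the sequence check is reduced to "has
__getitem__", and the `__iter__ = None` branch reflects the
anti-registration behavior in newer versions.

```python
def sequence_iterator(obj):
    """Yield obj[0], obj[1], ... until IndexError is raised."""
    index = 0
    while True:
        try:
            item = obj[index]
        except IndexError:
            return
        yield item
        index += 1

def get_iter(obj):
    """Simplified model of how iter(obj) picks an iterator."""
    cls = type(obj)
    if hasattr(cls, '__iter__'):
        if cls.__iter__ is None:
            raise TypeError('%r object is not iterable' % cls.__name__)
        it = cls.__iter__(obj)
        if not hasattr(type(it), '__next__'):
            raise TypeError('iter() returned non-iterator of type %r'
                            % type(it).__name__)
        return it
    if hasattr(cls, '__getitem__'):
        return sequence_iterator(obj)   # fallback for sequence-like types
    raise TypeError('%r object is not iterable' % cls.__name__)

class Squares:
    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)
        return index * index

class Blocked(Squares):
    __iter__ = None                     # opts out of the fallback

squares = list(get_iter(Squares()))
print(squares)
```

Calling get_iter(Blocked()) raises TypeError even though Blocked
inherits __getitem__, which is the point of setting __iter__ to None.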


Re: [Python-ideas] Fix default encodings on Windows

2016-08-19 Thread eryk sun
On Thu, Aug 18, 2016 at 3:25 PM, Steve Dower  wrote:
> allow us to change locale.getpreferredencoding() to utf-8 on Windows

_bootlocale.getpreferredencoding would need to be hard coded to return
'utf-8' on Windows. _locale._getdefaultlocale() itself shouldn't
return 'utf-8' as the encoding because the CRT doesn't allow it as a
locale encoding.

site.aliasmbcs() uses getpreferredencoding, so it will need to be
modified. The codecs module could add get_acp and get_oemcp functions
based on GetACP and GetOEMCP, returning for example 'cp1252' and
'cp850'. Then aliasmbcs could call get_acp.

Adding get_oemcp would also help with decoding output from
subprocess.Popen. There's been discussion about adding encoding and
errors options to Popen, and what the default should be. When writing
to a pipe or file, some programs use OEM, some use ANSI, some use the
console codepage if available, and far fewer use Unicode encodings.
Obviously it's better to specify the encoding in each case if you know
it.
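
To illustrate why the encoding has to be known, here is a small sketch
that reads a child's output as bytes and decodes it explicitly. The
child script and the run_and_decode helper are invented for the
example; the child stands in for a program that writes OEM (cp850)
text to a pipe:

```python
import subprocess
import sys

# Child process that writes one cp850-encoded character (e-acute) to stdout.
CHILD = r"import sys; sys.stdout.buffer.write('\xe9'.encode('cp850'))"


def run_and_decode(encoding, errors='strict'):
    """Run the child and decode its stdout with an explicit encoding."""
    proc = subprocess.run([sys.executable, '-c', CHILD],
                          stdout=subprocess.PIPE, check=True)
    return proc.stdout.decode(encoding, errors)


print(ascii(run_and_decode('cp850')))   # decoded with the right codepage
print(ascii(run_and_decode('cp1252')))  # wrong codepage: no error, wrong text
```

Decoding with the wrong codepage doesn't fail here; it silently yields
a different character, which is the practical argument for passing the
encoding explicitly.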

Regarding the locale module, how about modernizing
_locale._getdefaultlocale to return the Windows locale name [1] from
GetUserDefaultLocaleName? For example, it could return a tuple such as
('en-GB', None) and ('uz-Latn-UZ', None) -- always with the encoding
set to None. The CRT accepts the new locale names, but it isn't quite
up to speed. It still sets a legacy locale when the locale string is
empty. In this case the high-level setlocale could call
_getdefaultlocale. Also _parse_localename, which is called by
getlocale, needs to return a tuple with the encoding as None.
Currently it raises a ValueError for Windows locale names as defined
by [1].

[1]: https://msdn.microsoft.com/en-us/library/dd373814


Re: [Python-ideas] Fix default encodings on Windows

2016-08-18 Thread eryk sun
On Thu, Aug 18, 2016 at 2:32 AM, Stephen J. Turnbull
 wrote:
>
> So it's not just invalid surrogate *pairs*, it's invalid surrogates of
> all kinds.  This means that it's theoretically possible (though I
> gather that it's unlikely in the extreme) for a real Windows filename
> to be indistinguishable from one generated by Python's surrogateescape
> handler.

Absolutely, if the filesystem is one of Microsoft's such as NTFS,
FAT32, exFAT, ReFS, NPFS (named pipes), MSFS (mailslots) -- and I'm
pretty sure it's also possible with CDFS and UDFS. UDF allows any
Unicode character except NUL.

> What happens when Python's directory manipulation functions on Windows
> encounter such a filename?  Do they try to write it to the disk
> directory?  Do they succeed?  Does that depend on surrogateescape?

Python allows these 'Unicode' (but not strictly UTF compatible)
strings, so it doesn't have a problem with such filenames, as long as
it's calling the Windows wide-character APIs.

> Is there a reason in practice to allow surrogateescape at all on names
> in Windows filesystems, at least when using the *W API?  You mention
> non-Microsoft filesystems; are they common enough to matter?

Previously I gave an example with a VirtualBox shared folder, which
rejects names with invalid surrogates. I don't know how common that is
in general. I typically switch between 2 guests on a Linux host and
share folders between systems. In Windows I mount shared folders as
directory symlinks in C:\Mount.

I just tested another example that led to different results. Ext2Fsd
is a free ext2/ext3 filesystem driver for Windows. I mounted an ext2
disk in Windows 10. Next, in Python I created a file named
"\udc00b\udc00a\udc00d" in the root directory. Ext2Fsd defaults to
using UTF-8 as the drive codepage, so I expected it to reject this
filename, just like VBoxSF does. But it worked:

>>> os.listdir('.')[-1]
'\udc00b\udc00a\udc00d'

As expected the ANSI API substitutes question marks for the surrogate codes:

>>> os.listdir(b'.')[-1]
b'?b?a?d'

So what did Ext2Fsd write in this supposedly UTF-8 filesystem? I
mounted the disk in Linux to check:

>>> os.listdir(b'.')[-1]
b'\xed\xb0\x80b\xed\xb0\x80a\xed\xb0\x80d'

It blindly encoded the surrogate codes, creating invalid UTF-8. I
think it's called WTF-8 (Wobbly Transformation Format). The file
manager in Linux displays this file as "���b���a���d (invalid
encoding)", and ls prints "???b???a???d". Python uses its
surrogateescape error handler:

>>> os.listdir('.')[-1]
'\udced\udcb0\udc80b\udced\udcb0\udc80a\udced\udcb0\udc80d'

The original name can be decoded using the surrogatepass error handler:

>>> os.listdir(b'.')[-1].decode(errors='surrogatepass')
'\udc00b\udc00a\udc00d'
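
The same byte-level behaviour can be reproduced without mounting
anything, using only the codec error handlers. A minimal sketch (the
variable names are mine):

```python
name = '\udc00b\udc00a\udc00d'

# Strict UTF-8 refuses lone surrogates.
try:
    name.encode('utf-8')
except UnicodeEncodeError:
    strict_ok = False
else:
    strict_ok = True

# surrogatepass writes them anyway, giving the bytes seen from Linux.
raw = name.encode('utf-8', 'surrogatepass')

# surrogateescape treats each invalid byte separately (what os.listdir shows).
escaped = raw.decode('utf-8', 'surrogateescape')

# surrogatepass recovers the name as it was created on Windows.
restored = raw.decode('utf-8', 'surrogatepass')

print(strict_ok, raw, ascii(escaped), ascii(restored))
```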

Re: [Python-ideas] Fix default encodings on Windows

2016-08-16 Thread eryk sun
On Mon, Aug 15, 2016 at 6:26 PM, Steve Dower wrote:
>
> and using the *W APIs exclusively is the right way to go.

My proposal was to use the wide-character APIs, but transcoding CP_ACP
without best-fit characters and raising a warning whenever the default
character is used (e.g. substituting Katakana middle dot when creating
a file using a bytes path that has an invalid sequence in CP932). This
proposal was in response to the case made by Stephen Turnbull. If
using UTF-8 is getting such heavy pushback, I thought half a solution
was better than nothing, and it also sets up the infrastructure to
easily switch to UTF-8 if that idea eventually gains acceptance. It
could raise exceptions instead of warnings if that's preferred, since
bytes paths on Windows are already deprecated.
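
As a rough illustration of the idea, here is a pure-Python emulation.
It uses Python's own cp932 codec rather than MultiByteToWideChar, so
the substitute is U+FFFD instead of the codepage's default character,
and decode_ansi_path is a made-up name:

```python
import warnings


def decode_ansi_path(path, encoding='cp932'):
    """Decode a bytes path strictly; warn and substitute on invalid input."""
    try:
        return path.decode(encoding)
    except UnicodeDecodeError as exc:
        warnings.warn('bytes path %r is not valid %s: %s'
                      % (path, encoding, exc.reason),
                      UnicodeWarning, stacklevel=2)
        # Lossy fallback: invalid bytes become U+FFFD.
        return path.decode(encoding, 'replace')


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    good = decode_ansi_path('\u30c6'.encode('cp932'))  # valid CP932 input
    bad = decode_ansi_path(b'\x81\xad')                # undefined in CP932

print(ascii(good), ascii(bad), len(caught))
```

Switching the fallback branch to re-raise would give the "exceptions
instead of warnings" variant.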

> *Any* encoding that may silently lose data is a problem, which basically
> leaves utf-16 as the only option. However, as that causes other problems,
> maybe we can accept the tradeoff of returning utf-8 and failing when a
> path contains invalid surrogate pairs

Are there any common sources of illegal UTF-16 surrogates in Windows
filenames? I see that WTF-8 (Wobbly) was developed to handle this
problem. A WTF-8 path would roundtrip back to the filesystem, but it
should only be used internally in a program.
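
For what it's worth, Python's surrogatepass handler gets close to this
for internal use. The helpers below are hypothetical names and only
approximate WTF-8; they don't enforce every rule of that format:

```python
def wide_to_wtf8(wide):
    """UTF-16LE bytes (as the *W APIs use) -> UTF-8-like bytes, lossless."""
    text = wide.decode('utf-16-le', 'surrogatepass')
    return text.encode('utf-8', 'surrogatepass')


def wtf8_to_wide(data):
    """Inverse of wide_to_wtf8, for handing the name back to the filesystem."""
    text = data.decode('utf-8', 'surrogatepass')
    return text.encode('utf-16-le', 'surrogatepass')


# A lone low surrogate (U+DC00) followed by "bad", as raw UTF-16LE.
wide = b'\x00\xdc' + 'bad'.encode('utf-16-le')
packed = wide_to_wtf8(wide)
print(packed, wtf8_to_wide(packed) == wide)
```

Valid surrogate pairs still come out as ordinary 4-byte UTF-8, so only
the ill-formed names produce non-standard bytes.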


Re: [Python-ideas] Fix default encodings on Windows

2016-08-12 Thread eryk sun
On Thu, Aug 11, 2016 at 9:07 AM, Paul Moore  wrote:
> set codepage to UTF-8
> ...
> set codepage back
> spawn subprocess X, but don't wait for it
> set codepage to UTF-8
> ...
> ... At this point what codepage does Python see? What codepage does
> process X see? (Note that they are both sharing the same console).

The input and output codepages are global data in conhost.exe. They
aren't tracked for each attached process (unlike input history and
aliases). That's how chcp.com works in the first place. Otherwise its
calls to SetConsoleCP and SetConsoleOutputCP would be pointless.
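
You can observe this from Python with ctypes. A small sketch;
console_codepages is a made-up helper, and returning None off Windows
is just to keep it importable everywhere:

```python
import ctypes
import sys


def console_codepages():
    """Return (input_cp, output_cp) for the attached console, or None."""
    if sys.platform != 'win32':
        return None
    kernel32 = ctypes.WinDLL('kernel32')
    # Both calls return 0 if the process has no console attached.
    return kernel32.GetConsoleCP(), kernel32.GetConsoleOutputCP()


print(console_codepages())
```

Since the values live in conhost.exe, calling this again after another
attached process has run chcp.com should show the new codepages.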

But IMHO all talk of using codepage 65001 is a waste of time. I think
the trailing garbage output with this codepage in Windows 7 is
unacceptable. And getting EOF for non-ASCII input is a show stopper.
The problem occurs in conhost. All you get is the EOF result from
ReadFile/ReadConsoleA, so it can't be worked around. This kills the
REPL and raises EOFError for input(). ISTM the only people who think
codepage 65001 actually works are those using Windows 8+ who
occasionally need to print non-OEM text and never enter (or paste)
anything but ASCII text.


Re: [Python-ideas] Fix default encodings on Windows

2016-08-10 Thread eryk sun
On Wed, Aug 10, 2016 at 11:30 PM, Random832  wrote:
> Er... utf-8 doesn't work reliably with arbitrary bytes paths either,
> unless you intend to use surrogateescape (which you could also do with
> mbcs).
>
> Is there any particular reason to expect all bytes paths in this
> scenario to be valid UTF-8?

The problem is more that data is lost silently, without any error, when
using the legacy ANSI API. If the path is invalid UTF-8, Python will at
least raise an exception when decoding it. To work around this, the
developers may decide they need to just bite the bullet and use
Unicode, or maybe there could be legacy Latin-1 and ANSI modes enabled
by an environment variable or sys flag.


Re: [Python-ideas] Fix default encodings on Windows

2016-08-10 Thread eryk sun
On Wed, Aug 10, 2016 at 8:09 PM, Random832  wrote:
> On Wed, Aug 10, 2016, at 15:22, Steve Dower wrote:
>>
>> Allowing library developers who support POSIX and Windows to just use
>> bytes everywhere to represent paths.
>
> Okay, how is that use case impacted by it being mbcs instead of utf-8?

Using 'mbcs' doesn't work reliably with arbitrary bytes paths in
locales that use a DBCS codepage such as 932. If a sequence is
invalid, it gets passed to the filesystem as the default Unicode
character, so it won't successfully roundtrip. In the following
example b"\x81\xad", which isn't defined in CP932, gets mapped to the
codepage's default Unicode character, Katakana middle dot, which
encodes back as b"\x81E":

>>> locale.getpreferredencoding()
'cp932'
>>> open(b'\x81\xad', 'w').close()
>>> os.listdir('.')
['・']
>>> unicodedata.name(os.listdir('.')[0])
'KATAKANA MIDDLE DOT'
>>> '・'.encode('932')
b'\x81E'

This isn't a problem for single-byte codepages, since every byte value
uniquely maps to a Unicode code point, even if it's simply b'\x81' =>
u"\x81". Obviously there's still the general problem of dealing with
arbitrary Unicode filenames created by other programs, since the ANSI
API can only return a best-fit encoding of the filename, which is
useless for actually accessing the file.
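
The roundtrip difference can be approximated with Python's codecs. One
caveat: Python's own cp1252 codec is stricter than Windows and leaves a
few bytes (including 0x81) undefined, so Latin-1 stands in below for a
fully mapped single-byte codepage. The helper name is made up:

```python
def roundtrips_all_bytes(encoding):
    """True if every single byte value survives decode -> encode unchanged."""
    for value in range(256):
        raw = bytes([value])
        try:
            if raw.decode(encoding).encode(encoding) != raw:
                return False
        except UnicodeError:
            return False
    return True


middle_dot = '\u30fb'  # KATAKANA MIDDLE DOT, the CP932 default character

print(roundtrips_all_bytes('latin-1'))
print(roundtrips_all_bytes('cp1252'))  # Python's codec, not the Windows API
print(middle_dot.encode('cp932'))      # not the b'\x81\xad' we started with
```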

>> It probably also entails opening the file descriptor in bytes mode,
>> which might break programs that pass the fd directly to CRT functions.
>> Personally I wish they wouldn't, but it's too late to stop them now.
>
> The only thing O_TEXT does rather than O_BINARY is convert CRLF line
> endings (and maybe end on ^Z), and I don't think we even expose the
> constants for the CRT's unicode modes.

Python 3 uses O_BINARY when opening files, unless you explicitly call
os.open. Specifically, FileIO.__init__ adds O_BINARY to the open flags
if the platform defines it.
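
If you do call os.open yourself, you can add the flag the same way. A
sketch (write_raw is a hypothetical helper, not how FileIO is
implemented):

```python
import os
import tempfile


def write_raw(path, data):
    """Write bytes through a raw fd, adding O_BINARY where it exists."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    flags |= getattr(os, 'O_BINARY', 0)  # defined on Windows, absent on POSIX
    fd = os.open(path, flags, 0o666)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'demo.bin')
    write_raw(target, b'a\nb\n')
    with open(target, 'rb') as f:
        written = f.read()

print(written)
```

Without O_BINARY on Windows, the CRT's text mode would translate each
LF written through the fd to CRLF.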

The Windows CRT reads the BOM for the Unicode modes O_WTEXT,
O_U16TEXT, and O_U8TEXT. For O_APPEND | O_WRONLY mode, this requires
opening the file twice, the first time with read access. See
configure_text_mode() in "Windows
Kits\10\Source\10.0.10586.0\ucrt\lowio\open.cpp".

Python doesn't expose or use these Unicode text-mode constants. That's
for the best because in Unicode mode the CRT invokes the invalid
parameter handler when a buffer doesn't have an even number of bytes,
i.e. a multiple of sizeof(wchar_t). Python could copy how
configure_text_mode() handles the BOM, except it shouldn't write a BOM
for new UTF-8 files.
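
Something along these lines, as a sketch only. It is not a port of
configure_text_mode(); the function names are invented, and handling
just the UTF-8 and UTF-16 LE BOMs is my simplification:

```python
import codecs
import os
import tempfile


def sniff_encoding(path, default='utf-8'):
    """Pick a codec from the file's BOM; new or BOM-less files get default."""
    try:
        with open(path, 'rb') as f:
            head = f.read(3)
    except FileNotFoundError:
        return default  # new file: plain UTF-8, so no BOM gets written
    if head.startswith(codecs.BOM_UTF8):
        return 'utf-8-sig'  # skips the BOM when reading
    if head.startswith(codecs.BOM_UTF16_LE):
        return 'utf-16'     # the BOM selects the byte order
    return default


def read_text(path):
    """Read a text file using the encoding suggested by its BOM."""
    with open(path, encoding=sniff_encoding(path)) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    samples = {
        'utf8bom.txt': codecs.BOM_UTF8 + b'hi',
        'utf16.txt': codecs.BOM_UTF16_LE + 'hi'.encode('utf-16-le'),
        'plain.txt': b'hi',
    }
    results = {}
    for filename, payload in samples.items():
        full = os.path.join(tmp, filename)
        with open(full, 'wb') as f:
            f.write(payload)
        results[filename] = read_text(full)
    new_file_encoding = sniff_encoding(os.path.join(tmp, 'missing.txt'))

print(results, new_file_encoding)
```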