[Python-ideas] Re: Please update shutil.py copyfileobj to include code mentioned in below issues
On 6/8/23, jsch...@sbcglobal.net wrote: > I opened two issues regarding copyfileobj that were not bugs, but a fix that > was involved helped me figure out I needed a new external drive, since it > displayed the error number from the copyfileobj function. I'd like a > modified version of this code implemented permanently in shutil.py so others > could see if they have the same issue as me. I don't know how many people still subscribe to and read this mailing list. More people would see this suggestion if you posted this on discuss.python.org/c/ideas. > This is the original issue that has the code I was using that Eryksun > posted. > > https://github.com/python/cpython/issues/96721 > > Here's the second issue where it happened again. I put the error message in > this post, so you can see how it helped me. Also, the code might need to be > modified slightly, since it generated an error. > > https://github.com/python/cpython/issues/102357 The ctypes code that I provided was only for debugging purposes. Python needs to support the C runtime's _doserrno value (actually it's a Windows error code) internally for I/O calls such as _wopen(), close(), read(), and write(). Also, the error that you encountered, ERROR_NO_SUCH_DEVICE (433), should be mapped to the C errno value ENOENT (i.e. FileNotFoundError) in PC/errmap.h. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/KVIYVVZCHCOYLOFQNJDNILCF7KQVBR6A/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)
On 11/7/22, Eryk Sun wrote:
>
>     def isjunction(path):
>         """Test whether a path is a junction.
>         """
>         try:
>             st = os.lstat(path)
>         except (OSError, ValueError, AttributeError):
>             return False
>         return bool(st.st_reparse_tag & stat.IO_REPARSE_TAG_MOUNT_POINT)

The bitwise AND check in the above is wrong. It should check whether the tag *equals* IO_REPARSE_TAG_MOUNT_POINT. Sorry, this was an editing mistake when I simplified the expression to remove a redundant check of st_file_attributes.

This idea is being developed for Python 3.12:

https://github.com/python/cpython/issues/99547
https://github.com/python/cpython/pull/99548

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/GEUZJE2Q2ACFGSMPHWPO5437CXRRNAZ3/
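For reference, a minimal sketch of the corrected test described above, comparing the tag for equality instead of masking it. The getattr() fallbacks are an assumption added here so that the sketch also runs on platforms where the stat module does not define the constant and os.lstat() results have no st_reparse_tag; 0xA0000003 is the Windows value of IO_REPARSE_TAG_MOUNT_POINT.

```python
import os
import stat

# Assumption: fall back to the Windows SDK value when the stat module on
# this platform does not define the constant.
_MOUNT_POINT_TAG = getattr(stat, "IO_REPARSE_TAG_MOUNT_POINT", 0xA0000003)


def isjunction(path):
    """Test whether a path is a junction (always False off Windows)."""
    try:
        st = os.lstat(path)
    except (OSError, ValueError, AttributeError):
        return False
    # Equality, not bitwise AND: other reparse tags share bits with this one.
    return getattr(st, "st_reparse_tag", 0) == _MOUNT_POINT_TAG
```

On a non-Windows system this always returns False, which matches the "just return False on POSIX" option discussed in the thread.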
[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)
On 11/8/22, Charles Machalow wrote: > > Funny enough in PowerShell, for prints an "l" for both symlinks and > junctions.. so it kind of thinks of it as a link of some sort too I guess. As does Python already in many cases. For example, os.lstat() doesn't traverse a mount point (junction). On Windows, symlinks and mount points are in a general category of name-surrogate reparse points. os.lstat() doesn't traverse them. If Python supported copying a mount point via os.symlink(os.readlink(src), dst), I'd be reluctantly in favor of just letting ntpath.islink() return true for a mount point, as a practical measure for seamless cross-platform implementations of functions like rmtree() and copytree(). In terms of POSIX that's nonsense, but not really on Windows. > Is it that much of a waste to just return False on posix? I mean it's a > couple lines and just maintains api.. and in theory can be more clear to > some. I'm just thinking this through in terms of conceptual cost and usefulness in the standard library relative to how easy it is to implement one's own isjunction() or is_name_surrogate() test. Of course, a lot of the os.path tests have simple implementations, such as exists(), isdir() and isfile(). They're in the standard library because they're commonly needed. The question is whether isjunction() is needed enough generally to justify adding it. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/G4YQTXFPDN5YQLNYUUKCP2NV4DLGWSTN/ Code of Conduct: http://python.org/psf/codeofconduct/
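As a rough illustration of how small a hand-rolled is_name_surrogate() test can be, here is a sketch. It assumes the name-surrogate flag is bit 29 (0x20000000) of the reparse tag, as in the Windows SDK's IsReparseTagNameSurrogate() macro; the helper names are made up for this example and are not part of the standard library.

```python
import os

# Assumption: bit 29 of a reparse tag marks a name surrogate, per the
# IsReparseTagNameSurrogate() macro in the Windows SDK headers.
NAME_SURROGATE_BIT = 0x20000000


def tag_is_name_surrogate(tag):
    """Return True if a reparse tag has the name-surrogate bit set."""
    return bool(tag & NAME_SURROGATE_BIT)


def is_name_surrogate(path):
    """Test whether a path is a name-surrogate reparse point.

    This covers both symlinks and junctions on Windows. Where lstat()
    results have no st_reparse_tag, it returns False.
    """
    try:
        st = os.lstat(path)
    except (OSError, ValueError):
        return False
    return tag_is_name_surrogate(getattr(st, "st_reparse_tag", 0))
```

Both IO_REPARSE_TAG_SYMLINK (0xA000000C) and IO_REPARSE_TAG_MOUNT_POINT (0xA0000003) have this bit set, which is why a single check can cover the rmtree()-style "treat it like a link" case.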
[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)
On 11/8/22, Charles Machalow wrote: > I tend to prefer adding isjunction instead of changing ismount since I tend > to not think about junctions as being mounts (but closer to symlinks).. Junctions are mount points that are similar to Unix bind mounts where it counts -- in the behavior that's implemented for them in the kernel. This behavior isn't exclusive to just volume mount points. It's implemented the same for all junctions, and it's distinctly different from symlinks. There are times that I want to handle non-root mount points as if they're symlinks, such as deleting them in rmtree(). There are times where I want to handle them distinctly from symlinks, such as adding code in copytree() to copy a junction. > I guess either way the closeness of the concepts is a different story than > the specific ask here. In other words: for clarity, adding a specific > method makes the most sense to me. Adding a posixpath.isjunction() function that's always false seems a waste compared to common support for os.path.ismount(). On the other hand, the realpath() call in posixpath.ismount() is expensive, so calling os.path.ismount() to decide how to handle a directory would be expensive on POSIX. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/KC5UNZRMTL6AUYOLJG7A4VV2LIJAVN6V/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)
On 11/7/22, Charles Machalow wrote:
> So would you be for specific methods to check if a given path is a
> junction?

I'd prefer for ismount() to be modified to always return true for a junction. This would be a significant rewrite of the current implementation, which is only true for a junction that targets a system volume mount point (i.e. "\\?\Volume{GUID}\"). Of course ismount() wouldn't be true for only junctions. It would also be true for the root path of any drive, device, or UNC share if it's an existing filesystem directory.

Implementing a function that checks for only a junction is simple enough. For example:

    def isjunction(path):
        """Test whether a path is a junction.
        """
        try:
            st = os.lstat(path)
        except (OSError, ValueError, AttributeError):
            return False
        return bool(st.st_reparse_tag & stat.IO_REPARSE_TAG_MOUNT_POINT)

To be completely certain, sometimes st_file_attributes is also checked for stat.FILE_ATTRIBUTE_REPARSE_POINT. But a filesystem that sets a reparse point on a directory without also setting the latter file attribute would be dysfunctional.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/SE6ZHNRQ44D72ZPVCGTLNFSKVX5SAGXP/
[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)
On 11/7/22, Charles Machalow wrote:
>
> Junctions are contextually similar to symlinks on Windows.

Junctions (i.e. IO_REPARSE_TAG_MOUNT_POINT) are implemented to behave as mount points for local volumes, so there are a couple of important differences.

In a remote path, a junction gets resolved on the server side, which is always possible because the target of a junction must be a local volume (i.e. local to the server). Thus a junction that targets "C:\spam" resolves to the "C:" drive on the remote system. If you're resolving a junction manually via `os.readlink()`, take care to never resolve a remote junction target as a local path such as "C:\spam". That would not only be wrong but also potentially harmful if client files get mistakenly modified, replaced, or deleted.

On the other hand, a remote symlink that targets "C:\spam" gets resolved by the client and thus always resolves to the local "C:" drive of the client. This depends on the client system allowing remote-to-local (R2L) symlinks, which is disabled by default for good reason. When resolving a symlink manually, at worst you'll be in violation of the system's L2L, L2R, R2L, or R2R symlink policy.

Secondly, the target of a junction does not replace the previously traversed path when the system parses a path. This affects how a relative symlink gets resolved, in which case traversed junctions behave like Unix bind mount points. Say that "E:\eggs\spamlink" is a relative symlink that targets "..\spam". When accessed directly, this symbolic link resolves to "E:\spam". Say that "C:\mount\junction" targets "E:\eggs". Then "C:\mount\junction\spamlink" resolves to "C:\mount\spam", a different file in this case.

In contrast, the target of a symlink always replaces the traversed path when the system parses a path. Say that "C:\mount\symlink" targets "E:\eggs". Then "C:\mount\symlink\spamlink" resolves to "E:\spam", the same as if "E:\eggs\spamlink" had been opened directly.

> Currently is_symlink/islink return False for junctions.

Some API contexts, libraries, and applications only support IO_REPARSE_TAG_SYMLINK reparse points as symlinks. For general compatibility that's the only type of reparse point that reliably counts as a "symlink".

Also, part of the rationale for this division is that currently we cannot copy a junction via os.readlink() and os.symlink(). If we were to copy a junction as a symlink, in general this could change how the target path is resolved or how the link behaves in the context of relative symlinks. It would be less of an issue if os.readlink() returned an object type that allowed duplicating any name-surrogate reparse point via os.symlink(). Instead of calling WinAPI CreateSymbolicLinkW() in such cases, os.symlink() would create the target file/directory and directly set the reparse point via FSCTL_SET_REPARSE_POINT.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/YM6EUE6HSH7QJISUXH3J24C4OSAN7JLR/
[Python-ideas] Re: Add copy to pathlib
On 10/18/22, Todd wrote: > > So I think it would make a lot of sense to include copying inside pathlib. > I propose adding a `copy` method to `pathlib.Path` (for concrete paths). > > The specific call signature would be: > > copy(dst, *, follow_symlinks=True, recursive=True, dir_exist_ok=True) > > This will call `shutil.copytree` for directories if recursive is True, or > `copy2` if recursive if False. For files it will call `copy2` always. FYI, Barney Gale also proposed implementing copy() and copytree() methods recently. Barney is working on a significant restructuring of pathlib. https://discuss.python.org/t/incrementally-move-high-level-path-operations-from-shutil-to-pathlib/19208 ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/VLZ52HC6625KYESUHP6UNLUAD4FIXZC4/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Add copy to pathlib
On 10/18/22, Todd wrote: > > How is it any less of a "path operation" than moving files, reading and > writing files, making directories, and deleting files? Path-related operations involve creating, linking, symlinking, and listing directories and files, and peripherally also accessing file metadata such as size, timestamps, attributes, and permissions (i.e. filesystem indexing and bookkeeping). Reading and writing are I/O data operations on the contents of files. Copying a file is a path operation in that a new file gets created in the filesystem, but it's primarily an I/O operation, as are the read_text(), read_bytes(), write_text() and write_bytes() methods of Path objects. The ship sailed a long time ago. Path objects support I/O. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PPKSABYX2XUDNFJTNBJWBFFBPFFJJEDP/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Use 'bin' in virtual environments on Windows
On 7/24/22, Barry Scott wrote: > >> On 21 Jul 2022, at 16:42, Christopher Barker wrote: > >> However, I’m no Windows expert, but I *think* the modern Windows file >> system(s?) support something like symlinks. It’s an under-the-hood >> feature, but maybe it’s possible to add a symlink for bin. > > It has symlinks but only available if you are administrator. Creating symlinks requires the filesystem to support NT reparse points. That's guaranteed for the system volume, which must be NTFS, but it's unreliable when development is spread across various filesystems. This is the main obstacle to relying on a "Scripts" -> "bin" link. It's not technically correct to state that creating symlinks requires administrator access. It requires SeCreateSymbolicLinkPrivilege, or no privilege at all if developer mode is enabled for the system in Windows 10+. By default this privilege is granted to just the administrators group. However, an administrator can grant it to any user or group. I prefer to grant it to the "Authenticated Users" group. If creating a directory symlink isn't allowed, and the filesystem supports reparse points, then a junction mount point can be created instead. In Unix terms, this is like using a bind mount instead of a symlink. In Windows, creating a mount point doesn't require any privilege or special access. (Registering it with the mount-point manager requires administrator access, but that's only done for volume mount points, as created by SetVolumeMountPointW.) ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/AS7REUDGBW4QGQPXTNCMGMC4L2ZSUUXI/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: TextIOBase: Make tell() and seek() pythonic
On 5/26/22, Steven D'Aprano wrote:
>
> If you seek() to position 4, say, the results will be unpredictable but
> probably not anything good.
>
> In other words, the tell() and seek() cookies represent file positions
> in **bytes**, even though we are reading or writing a text file.

To clarify the general context, text I/O tell() and seek() cookies aren't necessarily just a byte offset. They can be packed integers that include a start position, decoder flags, a number of bytes to be fed into the decoder, whether the decode operation should be final (EOF), and the number of decoded characters (ordinals) to skip. For example:

    >>> open('spam.txt', 'w', encoding='utf-7').write('\u0100'*4)
    4
    >>> f = open('spam.txt', encoding='utf-7')
    >>> f.read(2)
    'ĀĀ'
    >>> f.tell()
    68056473487184303961218560357960280

    >>> start_pos, dec_flags, bytes_to_feed, need_eof, chars_to_skip = (
    ...    _pyio.TextIOWrapper._unpack_cookie(..., f.tell()))
    >>> start_pos, dec_flags, bytes_to_feed, need_eof, chars_to_skip
    (0, 55834574848, 2, False, 0)

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/NLU47DADXVIPBJLGP4IPLPKYBWH7DN7F/
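The practical consequence is that a tell() cookie should be treated as an opaque bookmark: store it, pass it back to seek(), and never do arithmetic on it. A minimal sketch of that usage follows. The helper name and the choice of UTF-16 (a stateful encoding because of its BOM) are illustrative assumptions, not something from the thread.

```python
import os
import tempfile


def read_with_bookmark(path, encoding, first):
    """Read `first` characters, bookmark the position, then re-read the rest."""
    with open(path, encoding=encoding) as f:
        head = f.read(first)
        cookie = f.tell()   # opaque value; only meaningful to seek()
        rest = f.read()
        f.seek(cookie)      # return to the bookmark
        again = f.read()
    return head, rest, again


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "spam.txt")
    with open(path, "w", encoding="utf-16") as f:
        f.write("\u0100" * 4)
    head, rest, again = read_with_bookmark(path, "utf-16", 2)
    # Seeking back to the cookie yields the same remaining text.
    assert rest == again
```

The numeric value of the cookie is deliberately not printed or compared here, since it depends on the codec and the decoder state.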
[Python-ideas] Re: TextIOBase: Make tell() and seek() pythonic
On 5/26/22, Christopher Barker wrote:
> IIRC, there were two builds- 16 and 32 bit Unicode. But it wasn’t UTF16, it
> was UCS-2.

In the old implementation prior to 3.3, narrow and wide builds were supported regardless of the size of wchar_t. For a narrow build, if wchar_t was 32-bit, then PyUnicode_FromWideChar() would encode non-BMP ordinals as UTF-16 surrogate pairs, and PyUnicode_AsWideChar() implemented the reverse, from UTF-16 back to UTF-32. There were several similar cases, such as PyUnicode_FromOrdinal().

The header called this "limited" UTF-16 support, primarily I suppose because the length of strings and indexing failed to account for surrogate pairs. For example:

    >>> s = '\U00010000'
    >>> len(s)
    2
    >>> s[0]
    '\ud800'
    >>> s[1]
    '\udc00'

Here's a link to the old implementation:

https://github.com/python/cpython/blob/v3.2.6/Objects/unicodeobject.c

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ATPNS7CEQUONIWDXFCQEEUUGJBOJV72L/
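To make the surrogate-pair arithmetic concrete, here is a small sketch of the standard UTF-16 encoding of a non-BMP code point. The function name is made up for this example; it is not the C code from the old unicodeobject.c.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


# U+10000 is the first non-BMP code point: offset 0 in both halves.
assert to_surrogate_pair(0x10000) == (0xD800, 0xDC00)
```

On a narrow build, len() and indexing counted these two code units separately, which is the "limited" behavior shown in the session above.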
[Python-ideas] Re: Custom literals, a la C++
On 4/11/22, Chris Angelico wrote: > > Which raises the question: what if the current directory no longer has > a path name? Or is that simply not possible on Windows? The process working directory is opened without FILE_SHARE_DELETE sharing. This prevents opening the directory with DELETE access from any security context in user mode, even by the SYSTEM account. If the handle for the working directory is forcefully closed (e.g. via Process Explorer) and the directory is deleted, then accessing a relative path in the affected process fails with ERROR_INVALID_HANDLE (6) until the working directory is changed to a valid directory. > (Don't even get me started on prefixing paths with \\?\ and what that > changes. Windows has bizarre backward compatibility constraints.) Paths prefixed by \\?\ or \\.\ are not supported for the process working directory and should not be used in this case. The Windows API is buggy if the working directory is set to a prefixed path. For example, it fails to identify a drive such as r"\\?\C:" or r"\\?\UNC\server\share" in the working directory, in which case a rooted path such as r"\spam" can't be accessed. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5A4RYPI6T7FHGRP7KOEL2ISQHHNUPLCJ/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Custom literals, a la C++
On 4/11/22, Chris Angelico wrote: > > If you say `open("/spam")`, Windows uses "default drive" + "explicit > directory". You can think of a default drive as being the drive of the current working directory, but there is no "default drive" per se that's stored separate from the working directory. Python and most other filesystem libraries generalize a UNC "\\server\share" path as a 'drive', in addition to drive-letter drives such as "Z:". However, the working directory is only remembered separately from the process working directory in the case of drive-letter drives, not UNC shares. If the working directory is r"\\server\share\foo\bar", then r"\spam" resolves to r"\\server\share\spam". If the working directory is r"\\server\share\foo\bar", then "spam" resolves to r"\\server\share\foo\bar\spam". However, the system will actually access this path relative to an open handle for the working directory. A handle for the process working directory is always kept open and thus protected from being renamed or deleted. Per-drive working directories are not kept open. They're just stored as path names in reserved environment variables. > Hence there are 26 current directories (one per drive), plus the > selection of current drive, which effectively chooses your current > directory. If the process working directory is a DOS drive path, then 26 working directories are possible. If the process working directory is a UNC path, then 27 working directories are possible. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/OR65GYLNYOV4LT3ZEM3YFIVHSOP3D664/ Code of Conduct: http://python.org/psf/codeofconduct/
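The resolution rules above can be approximated with pure string manipulation using ntpath, which is importable on any platform. This is only a sketch of the naming rules: it does not consult the operating system, and the resolve() helper is a name invented for this example.

```python
import ntpath


def resolve(working_dir, path):
    """Resolve `path` against `working_dir` using Windows path rules."""
    return ntpath.normpath(ntpath.join(working_dir, path))


unc_cwd = r"\\server\share\foo\bar"

# A rooted path keeps only the "drive" (here the UNC share) of the cwd.
print(resolve(unc_cwd, r"\spam"))   # \\server\share\spam

# A plain relative path is appended to the whole working directory.
print(resolve(unc_cwd, "spam"))     # \\server\share\foo\bar\spam
```

As noted above, the real system opens a plain relative path via the open handle of the working directory instead of re-parsing the joined string, so this only models the resulting name.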
[Python-ideas] Re: Custom literals, a la C++
On 4/11/22, Steven D'Aprano wrote: > > How does that work in practice? In Windows, if you just say the > equivalent to `open('spam')`, how does the OS know which drive > and WD to use? "spam" is resolved against the process working directory, which could be a UNC path instead of a drive. OTOH, "Z:spam" is relative to the working directory on drive "Z:". If the latter is r"Z:\foo\bar", then "Z:spam" resolves to r"Z:\foo\bar\spam". The working directory on a drive gets set via os.chdir() when the process working directory is set to a path on the drive. It's implemented via reserved environment variables with names that begin with "=", such as "=Z:" set to r"Z:\foo\bar". Python's os.environ doesn't support getting or setting these variables, but WinAPI GetEnvironmentVariableW() and SetEnvironmentVariableW() do. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ULS4MZZNF6MIUEGGRF5GIJ2PSJJOUGYL/ Code of Conduct: http://python.org/psf/codeofconduct/
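For the curious, a hedged sketch of reading one of these reserved "=X:" variables with ctypes. The function name and the 32768-character buffer size are choices made for this example; on non-Windows platforms the function simply returns None.

```python
import ctypes
import sys


def get_drive_working_dir(drive_letter):
    """Return the working directory recorded for a drive, or None.

    Reads the reserved environment variable named "=X:" via WinAPI
    GetEnvironmentVariableW(), since os.environ does not expose it.
    """
    if sys.platform != "win32":
        return None  # the reserved variables only exist on Windows
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    func = kernel32.GetEnvironmentVariableW
    func.argtypes = (ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_uint32)
    func.restype = ctypes.c_uint32
    size = 32768
    buf = ctypes.create_unicode_buffer(size)
    n = func(f"={drive_letter.upper()}:", buf, size)
    if n == 0 or n >= size:
        return None  # not set (or unexpectedly too long for the buffer)
    return buf.value
```

If the variable isn't set for a drive, relative paths such as "Z:spam" are resolved against the root of that drive.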
[Python-ideas] Re: Custom literals, a la C++
On 4/11/22, Steven D'Aprano wrote:
>
> You know how every OS process has its own working directory? Just like
> that, except every module.

A per-thread working directory makes more sense to me. But it would be a lot of work to implement support for this in the os and io modules, for very little gain.

> "One WD per process" is baked so deep into file I/O on Posix
> systems (and I presume Windows) that its probably impossible to
> implement in current systems.

Windows has up to 27 working directories per process. There's the overall working directory, plus one for each drive.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/IJFHA3HTOHEANOXD34KSK7TYDHZYULWA/
[Python-ideas] Re: Missing expandvars equivalent in pathlib
On 2/13/22, Eric Fahlgren wrote:
>
> That may or may not work as Windows has inconsistent treatment of multiple
> separators depending on where they appear in a path. If TEMP is a drive
> spec, say "t:\", then it expands to "t:\\spam.csv", which is an invalid
> windows path. If TEMP is a directory spec, "c:\temp\", then it expands to
> "c:\temp\\spam.csv", which works fine.
>
> C:\> dir c:\\temp\junk
> The filename, directory name, or volume label syntax is incorrect.

"c:\\temp\junk" isn't always invalid in CMD, and definitely not in the Windows API. The problem occurs because the DIR command in the CMD shell has legacy support to ignore the drive (e.g. "C:") when the root of the path is exactly two backslashes -- because of DOS in the 1980s (i.e. they went out of their way to add this behavior in CMD to make it compatible with DOS). To see this, check the "C$" administrative share on "localhost":

    C:\>dir /b C:\\localhost\C$\Temp\spam.txt
    File Not Found

    C:\>echo spam >C:\\Temp\spam.txt

    C:\>dir /b C:\\localhost\C$\Temp\spam.txt
    spam.txt

Even though using two backslashes for the root of a drive path is allowed in the Windows API itself, it's still problematic. The path part r"\\path\to\file" can't be used as relative to the current drive of the process because it's always a UNC absolute path. So it should be normalized to r"\path\to\file" as soon as possible, e.g. via GetFullPathNameW():

    >>> print(nt._getfullpathname(r'C:\\path\to\file'))
    C:\path\to\file

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/NAELNA54K5RDB23CM4MVGXRN7PBPNVYT/
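A portable way to sketch the same normalization is ntpath.normpath(), which collapses repeated separators using string rules only. This is a stand-in chosen so the example runs on any platform; it is not the GetFullPathNameW() call itself, and the variable names are illustrative.

```python
import ntpath

# TEMP set to a drive spec that ends in a backslash, as in the quoted example.
temp = "t:\\"
expanded = temp + "\\spam.csv"        # naive expansion: t:\\spam.csv

# Collapse the doubled root separator before using the path.
normalized = ntpath.normpath(expanded)
print(normalized)                     # t:\spam.csv

# The same cleanup for the path from the message above.
print(ntpath.normpath(r"C:\\path\to\file"))   # C:\path\to\file
```

Joining with ntpath.join(temp, "spam.csv") instead of string concatenation avoids creating the doubled separator in the first place.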
[Python-ideas] Re: Missing expandvars equivalent in pathlib
On 2/13/22, Paul Moore wrote: > > For better or worse, though, Windows (as an OS) doesn't have a "normal > behaviour". %-expansion is a feature of CMD and .bat files, which You're overlooking ExpandEnvironmentStringsW() [1], ExpandEnvironmentStringsForUserW(), and PathUnExpandEnvStringsW() [2], which provide basic support for `%` based environment variables in strings. Python's standard library supports winreg.ExpandEnvironmentStrings(). It is critical that the system supports this functionality in order to evaluate REG_EXPAND_SZ values in the registry. [1] https://docs.microsoft.com/en-us/windows/win32/api/processenv/nf-processenv-expandenvironmentstringsw [2] https://docs.microsoft.com/en-us/windows/win32/api/shlwapi/nf-shlwapi-pathunexpandenvstringsw ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/SNZVO3OAF5CZFALNQN6XIQRCJVN2NZ75/ Code of Conduct: http://python.org/psf/codeofconduct/
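A minimal sketch of `%`-style expansion from Python follows. It assumes winreg.ExpandEnvironmentStrings() on Windows and falls back to ntpath.expandvars() elsewhere so the example stays runnable; the fallback also expands `$name` forms, so the two are not strictly equivalent. The function name and the SPAM_DIR variable are hypothetical.

```python
import ntpath
import os
import sys


def expand_percent_vars(text):
    """Expand %NAME% references in `text` from the process environment."""
    if sys.platform == "win32":
        import winreg
        # Uses the system's own expansion, as for REG_EXPAND_SZ values.
        return winreg.ExpandEnvironmentStrings(text)
    # Portable approximation (also expands $NAME, unlike the system call).
    return ntpath.expandvars(text)


os.environ["SPAM_DIR"] = r"C:\Temp"
print(expand_percent_vars(r"%SPAM_DIR%\spam.csv"))
```

Undefined variables are left in place by both code paths, which is usually what a caller wants to detect.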
[Python-ideas] Re: Please consider mentioning property without setter when an attribute can't be set
On 2/13/22, Christopher Barker wrote:
>
> Telling newbies that that means that it's either a property with no setter,
> or am object without a __dict__, or one with __slots__ defined is not
> really very helpful.

The __slots__ case is due to the lack of a __dict__ slot. It can be manually added in __slots__ (though adding __dict__ back is uncommon), along with the __weakref__ slot.

The exception message when there's no __dict__ is generally good enough. For example:

    >>> (1).x = None
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'int' object has no attribute 'x'

It's clear that the object has no __dict__ and no descriptor named "x". However, the message gets confusing with partially implemented magic attributes. For example, implement __getattr__(), but not __setattr__() or __delattr__():

    class C:
        __slots__ = ()

        def __getattr__(self, name):
            class_name = self.__class__.__name__
            if name == 'x':
                return 42
            raise AttributeError(f'{class_name!r} object has no '
                                 f'attribute {name!r}')

    >>> c = C()
    >>> c.x
    42
    >>> c.x = None
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'C' object has no attribute 'x'

Add __setattr__():

        def __setattr__(self, name, value):
            class_name = self.__class__.__name__
            if name == 'x':
                raise AttributeError(f'attribute {name!r} of {class_name!r} '
                                     'objects is not writable')
            raise AttributeError(f'{class_name!r} object has no '
                                 f'attribute {name!r}')

    >>> c = C()
    >>> c.x = None
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 12, in __setattr__
    AttributeError: attribute 'x' of 'C' objects is not writable

    >>> del c.x
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'C' object has no attribute 'x'

Add __delattr__():

        def __delattr__(self, name):
            class_name = self.__class__.__name__
            if name == 'x':
                raise AttributeError(f'attribute {name!r} of {class_name!r} '
                                     'objects is not writable')
            raise AttributeError(f'{class_name!r} object has no '
                                 f'attribute {name!r}')

    >>> c = C()
    >>> c.x
    42
    >>> c.x = None
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 12, in __setattr__
    AttributeError: attribute 'x' of 'C' objects is not writable

    >>> del c.x
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 19, in __delattr__
    AttributeError: attribute 'x' of 'C' objects is not writable

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/3S2KW3O7O7KKBQD2FVW6NG3CISNHF745/
[Python-ideas] Re: Please consider mentioning property without setter when an attribute can't be set
On 2/11/22, Paul Moore wrote: > > I'm inclined to say just raise an issue on bpo. If it's easy enough, > it'll just get done. If it's hard, having lots of people support the > idea won't make it any easier. I don't think this is something that > particularly needs evidence of community support before asking for it. The error message is in property_descr_set() in Objects/descrobject.c. I agree that it should state that the attribute is a property. Python developers know that a property requires a getter, setter, and deleter method in order to function like a regular, mutable attribute. If not, help(property) explains it all clearly. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/WBQZMFZT4KCXKEUEND4BNAZFAAUA7HA2/ Code of Conduct: http://python.org/psf/codeofconduct/
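For readers who haven't run into the case under discussion, here is a small sketch of a property with a getter but no setter. The class and helper names are invented for this example, and the exact wording of the AttributeError differs between Python versions, so it is captured instead of hard-coded.

```python
class Temperature:
    """Example class with a read-only property (getter only)."""

    def __init__(self, celsius):
        self._celsius = celsius

    @property
    def celsius(self):
        return self._celsius


def try_set(obj, name, value):
    """Try to set an attribute; return the error message, or None on success."""
    try:
        setattr(obj, name, value)
    except AttributeError as exc:
        return str(exc)
    return None


t = Temperature(20)
# Assigning to the property fails because no setter was defined.
print(try_set(t, "celsius", 5))
# The underlying instance attribute is still writable.
print(try_set(t, "_celsius", 5))
```

Adding a `@celsius.setter` method is what turns the property into something that behaves like a regular, mutable attribute.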
[Python-ideas] Re: os.workdir() context manager
On 9/15/21, Paul Moore wrote: > > Just a somewhat off-topic note, but dir_fd arguments are only > supported on Unix, and the functionality only appears to be present at > the NT Kernel level on Windows, not in the Windows API. Handle-relative paths are supported by all NT system calls that access object paths, but NT doesn't support ".." components. Normal user-mode programs can make system calls directly (e.g. call NtCreateFile instead of CreateFile), but even if Python bypassed the Windows API to support dir_fd, the lack of support for ".." components in relative paths would be an annoying inconsistency with POSIX dir_fd support. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/QO253Y4XOMJZC7YQQRZPYME353M7WDDA/ Code of Conduct: http://python.org/psf/codeofconduct/
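For comparison, this is roughly what dir_fd usage looks like where it is supported. The sketch checks os.supports_dir_fd first and falls back to an ordinary joined path otherwise (for example on Windows); the helper name is made up for this example.

```python
import os
import tempfile


def read_relative(dir_path, name):
    """Read a file by name relative to a directory."""
    if os.open not in os.supports_dir_fd:
        # Fallback for platforms without dir_fd support.
        with open(os.path.join(dir_path, name), "rb") as f:
            return f.read()
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        # Open `name` relative to the directory handle, not the process cwd.
        fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
        with os.fdopen(fd, "rb") as f:
            return f.read()
    finally:
        os.close(dir_fd)


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "spam.txt"), "wb") as f:
        f.write(b"spam")
    print(read_relative(tmp, "spam.txt"))
```

On POSIX the `name` argument may contain ".." components, which is the part that a handle-relative NT implementation could not match.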
[Python-ideas] Re: Integer concatenation to byte string
On 3/1/21, mmax42...@gmail.com wrote:
> And there is no way to make a mutable bytes object without a function call.

Since a code object is immutable, the proposed bytearray display form would still require an internal operation that constructs a bytearray from a bytes object. For example, something like the following:

    BUILD_BYTEARRAY          0
    LOAD_CONST               0 (b'spam')
    BYTEARRAY_EXTEND         1

> I propose an array-type string like the, or for the bytearray. It would work
> as a mutable b-string, as
>
> foo = a"\x00\x01\x02abcÿ" # a-string, a mutable bytes object.
> foo[0] = 123 # Item assignment
> foo+= 255 # Works the same as

Concatenating a sequence with a number shouldn't be allowed. OTOH, I think `foo += [255]` should be supported as foo.extend([255]), but bytearray doesn't allow it currently. `foo.append(255)` is supported.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/BRYZHZAVNCZYYOVSKOKLZMATASMU4WH6/
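A short sketch of what bytearray supports today, for comparison with the proposal. Nothing here uses the hypothetical a"..." syntax; it only exercises existing methods.

```python
foo = bytearray(b"\x00\x01\x02abc")

foo[0] = 123          # item assignment with an int works
foo.append(255)       # append a single byte value
foo.extend([1, 2])    # extend from an iterable of ints
foo += b"\x03"        # in-place concatenation with a bytes-like object

try:
    foo += [255]      # a list is not bytes-like, so this is rejected
except TypeError as exc:
    print("rejected:", exc)

print(foo)
```

The in-place `+=` operator requires a bytes-like operand, while extend() accepts any iterable of integers in range(256), which is the asymmetry pointed out above.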
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/11/21, M.-A. Lemburg wrote: > On 11.02.2021 13:49, Eryk Sun wrote: > >> Currently, locale.getpreferredencoding(False) is implemented as >> locale._get_locale_encoding(). This ultimately calls >> _Py_GetLocaleEncoding(), defined in "Python/fileutils.c". >> TextIOWrapper() calls this C function to get the encoding to use when >> encoding=None is passed. > > All that seems to be new in Python 3.10. This is not what's > happening in Python 3.9. The _get_locale_encoding() function > doesn't even exist. In previous versions, locale.getpreferredencoding(False) is functionally the same. In 3.10, the latter is implemented in C via locale._get_locale_encoding(). > Why an env variable ? You could simply open up a ticket to get this > fixed, since 3.10 is not released yet. I thought it would be best to let users/administrators opt in to POSIX behavior. But maybe it should require opting out. >>>> getlocale(LC_CTYPE) > ('en_US', 'ISO8859-1') >>>> getlocale(LC_CTYPE) > ('el_GR', 'ISO8859-7') Windows code pages 1252 and 1253 are not the same as ISO-8859-1 and ISO-8859-7. getlocale() is just looking up the encoding of "en_US" and "el_GR" from the mapping in the locale module. That kind of best-guess result isn't right for locale._get_locale_encoding(). > The returned values for the encoding look mostly correct to > me, except the one for the 'C' locale which should be 'ascii'. The "C" locale in the Windows CRT uses Latin-1 for LC_CTYPE. This is implemented for mbstowcs() by casting from char to wchar_t. It's similar for wcstombs(), and limited to Unicode ordinals below 256. However, the "C" locale isn't consistently Latin-1 across other categories. IIRC, LC_TIME in the "C" locale uses the process ANSI code page for time-zone names, and mojibake is common. > Anyway, UTF-8 mode is the way to go these days, esp. if you want > to write applications which are portable across platforms and > behave the same on all. 
Globally setting PYTHONUTF8 forces all scripts to use UTF-8 as the default for open(). I'd like to let scripts opt in to using UTF-8 as the default for open() by way of an explicit setlocale() call such as setlocale(LC_CTYPE, (getdefaultlocale()[0], "UTF-8")) or, Windows only, setlocale(LC_CTYPE, ".UTF-8"). In POSIX, Python already tries coercing the "C" and "POSIX" locales (usually ASCII) to use UTF-8. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/A6HOUXS4E2LFCSZA4RTJ3OE6ZXHRVAQF/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/11/21, M.-A. Lemburg wrote: > I think the main problem here is that open() doesn't use > locale.getlocale()[1] as default for the encoding parameter, > but instead locale.getpreferredencoding(False). Currently, locale.getpreferredencoding(False) is implemented as locale._get_locale_encoding(). This ultimately calls _Py_GetLocaleEncoding(), defined in "Python/fileutils.c". TextIOWrapper() calls this C function to get the encoding to use when encoding=None is passed. In POSIX, _Py_GetLocaleEncoding() calls nl_langinfo(CODESET), which returns the current LC_CTYPE encoding, not the default LC_CTYPE encoding. For example, in Linux: >>> setlocale(LC_CTYPE, 'en_US.UTF-8') 'en_US.UTF-8' >>> _get_locale_encoding() 'UTF-8' >>> open('test.txt').encoding 'UTF-8' >>> setlocale(LC_CTYPE, 'en_US.ISO-8859-1') 'en_US.ISO-8859-1' >>> _get_locale_encoding() 'ISO-8859-1' >>> open('test.txt').encoding 'ISO-8859-1' In Windows, _Py_GetLocaleEncoding() just uses GetACP(), which returns the process ANSI code page. This is based on the CRT's default locale set by setlocale(LC_CTYPE, ""), which combines the user's default locale with the process ANSI code page. I'm not overjoyed about this combination in the default locale, since it's potentially inconsistent (e.g. Korean user locale with Latin 1252 process code page), but that ship sailed a long time ago. I'm not arguing to change locale.getdefaultlocale(). The problem is that locale._get_locale_encoding() in Windows is not returning the current LC_CTYPE locale encoding, in contrast to how it behaves in POSIX. I'd like an environment variable and/or -X option to fix this flaw. If enabled, and if the C runtime supports UTF-8 locales (as it has for the past 3 years in Windows 10), and the application warrants it (e.g. many open calls across many modules), then convenient use of UTF-8 would be one setlocale() call away. It's not for packages. 
Frankly, I don't see why it's a problem for a package developer to use encoding='utf-8' for files that need to use UTF-8. Developing libraries that are designed to work in arbitrary applications on multiple platforms is tedious work. Having to explicitly pass encoding='utf-8' goes with the territory, and it's a minor annoyance in the grand scheme of things. > That's what getlocale(LC_CTYPE) is intended for, unless I'm > missing something. getlocale() can't be relied on to parse the correct codeset from the locale name, and it can even raise ValueError (more likely in Windows, e.g. with the native locale name "en-US"). The codeset should be queried directly using an API call, such as nl_langinfo(CODESET) in POSIX. In Windows, the C runtime's POSIX locale implementation doesn't include nl_langinfo(). There's ___lc_codepage_func(), but it's documented as an internal function. A ucrt locale record, however, does expose the code page as a public field, as documented in the public header "corecrt.h". 
Here's a prototype using ctypes:

    import os
    import ctypes

    ucrt = ctypes.CDLL('ucrtbase', use_errno=True)

    class _crt_locale_data_public(ctypes.Structure):
        _fields_ = (('_locale_pctype', ctypes.POINTER(ctypes.c_ushort)),
                    ('_locale_mb_cur_max', ctypes.c_int),
                    ('_locale_lc_codepage', ctypes.c_uint))

    class _crt_locale_pointers(ctypes.Structure):
        _fields_ = (('locinfo', ctypes.POINTER(_crt_locale_data_public)),
                    ('mbcinfo', ctypes.c_void_p))

    ucrt._get_current_locale.restype = ctypes.POINTER(_crt_locale_pointers)

    CP_UTF8 = 65001

    def _get_locale_encoding():
        locale = ucrt._get_current_locale()
        if not locale:
            errno = ctypes.get_errno()
            raise OSError(errno, os.strerror(errno))
        try:
            # Read the public code page field of the current locale.
            codepage = locale[0].locinfo[0]._locale_lc_codepage
        finally:
            ucrt._free_locale(locale)
        if codepage == 0:
            return 'latin-1' # "C" locale
        if codepage == CP_UTF8:
            return 'utf-8'
        return f'cp{codepage}'

Examples with Python 3.9 in Windows 10:

    >>> setlocale(LC_CTYPE, 'C')
    'C'
    >>> _get_locale_encoding()
    'latin-1'

    >>> setlocale(LC_CTYPE, 'en_US')
    'en_US'
    >>> _get_locale_encoding()
    'cp1252'

    >>> setlocale(LC_CTYPE, 'el_GR')
    'el_GR'
    >>> _get_locale_encoding()
    'cp1253'

    >>> setlocale(LC_CTYPE, 'en_US.utf-8')
    'en_US.utf-8'
    >>> _get_locale_encoding()
    'utf-8'

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/OQJBNUKMKH6CHGJKFM6H6SCOEIYECLSU/
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/11/21, Christopher Barker wrote: > On Wed, Feb 10, 2021 at 12:33 AM Paul Moore wrote: > >> So get PYTHONUTF8 added to the environment activate script. That's a >> simple change to venv. And virtualenv, and conda > > That's probably a good solution for venv and virtualenv -- essentially add > it as another environment creation option. Note that using a virtual environment does not require activation. A script can be deployed to run in a virtual environment by referring to the environment's executable in a shebang line, e.g.: #!path\to\venv\Scripts\python.exe Or with a Windows shell link that runs path\to\venv\Scripts\python.exe path\to\script.py Setting PYTHONUTF8 in the activate script does nothing to educate users about the default encoding in other contexts. The REPL shell could print a short message at startup that informs the user that Python is using UTF-8 mode, including a link to a web page that explains this in more detail. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/SN4HJZFL3CXOJS53DUTQDRQ4MCXRLERT/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/10/21, M.-A. Lemburg wrote:
>
> setx PYTHONUTF8 1
>
> does the trick in an admin command shell on Windows globally.

The above command sets the variable only for the current user, which I'd recommend anyway. It does not require administrator access. To set a machine value, run `setx /M PYTHONUTF8 1`, which of course requires administrator access. Also, run `set PYTHONUTF8=1` in CMD or `$env:PYTHONUTF8=1` in PowerShell to set the variable in the current shell.

Unrelated to UTF-8 mode and long-term plans to make UTF-8 the preferred encoding, what I want, from the perspective of writing applications and scripts (not libraries), is a -X option and/or environment variable to make locale._get_locale_encoding() behave like it does in POSIX. It should return the LC_CTYPE codeset of the current locale, not just the default locale. This would allow setlocale() in Windows to change the default for encoding=None, just as it does in POSIX.

Technically it's not hard to implement in a way that's as reliable as nl_langinfo(CODESET) in POSIX. The code page of the current CRT locale is a public field. In Windows 10 the CRT has supported UTF-8 for 3 years -- regardless of the process active code page returned by GetACP(). Just call setlocale(LC_CTYPE, ".UTF-8") or setlocale(LC_CTYPE, (getdefaultlocale()[0], 'UTF-8')).

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/KCCRN4T4TLOUH6GYQ3JDIPFZUUDA4QQA/
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/9/21, Inada Naoki wrote: > On Tue, Feb 9, 2021 at 7:42 PM M.-A. Lemburg wrote: > > But it affects to all Python installs. Can teachers recommend to set > PYTHONUTF8 environment variable for students? Users can simply create a shortcut that targets `cmd /k set PYTHONUTF8=1`. Optionally change the shortcut's "start in" directory to the desired working directory. >> Here's a good blog post about setting env vars on Windows: >> >> https://www.dowdandassociates.com/blog/content/ >> howto-set-an-environment-variable-in-windows-command-line-and-registry/ Command-line modification of the persistent environment is rarely required. Using setx.exe is okay for setting simple variables in CMD [1], such as `setx PYTHONUTF8 1`, combined with `set PYTHONUTF8=1` for the current shell. To do this in the GUI in Windows 10, click on the start button (or tap the WIN key) to show the start menu; type "environ"; and click on "Edit environment variables for your account". In the window that opens, click the "New" button; type "PYTHONUTF8" as the name and "1" (without quotes) as the value. Click the "OK" button on the dialog, and then click the "OK" button on the editor window. To test the value, assuming you have the py launcher installed, press WIN+R to open the run dialog. Type "py", and in the Python shell confirm that executing `import locale; locale.getpreferredencoding()` returns 'UTF-8'. --- [1] I would feel remiss in discussing "setx.exe" without warning about naively trying to modify PATH. For example, DO NOT execute a command like `setx.exe PATH "C:\Program Files\Python39;%PATH%"`. This is wrong because it sets the current PATH value, including the system part, as the user "Path" value, truncated to 1024 characters, and without the original dependence on system variables and independent (REG_SZ) user variables. Properly modifying the persistent "Path" from CMD is difficult and requires careful use of both reg.exe and setx.exe. It's easier in PowerShell. 
It's far easier to use the GUI editor, which in Windows 10 even provides an exploded list view that makes it simple to add/remove directories and move them up and down in the list. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/WOTADHRBJRTMERYNVUOW4LMW3CIKHTDQ/ Code of Conduct: http://python.org/psf/codeofconduct/
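The check described in the steps above (confirming the preferred encoding in a fresh Python shell) can also be scripted. This is a minimal sketch; the function name `describe_default_encoding` is made up for illustration.

```python
import locale
import sys

def describe_default_encoding():
    """Return (utf8_mode, preferred_encoding) for the running interpreter."""
    # sys.flags.utf8_mode is 1 when -X utf8 or PYTHONUTF8=1 is in effect.
    utf8_mode = bool(sys.flags.utf8_mode)
    # The encoding reported for text I/O when none is given explicitly.
    preferred = locale.getpreferredencoding(False)
    return utf8_mode, preferred

if __name__ == '__main__':
    mode, enc = describe_default_encoding()
    print(f'UTF-8 mode: {mode}; preferred encoding: {enc}')
```

Run it from a process started after the variable was set, since an already running shell keeps its old environment.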
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/6/21, Christopher Barker wrote: > On Sat, Feb 6, 2021 at 11:47 AM Eryk Sun wrote: > >> Relative to the installation, "python.cfg" should only be found in the >> same directory as the base executable, not its parent directory. > > OK, my mistake — I thought that was already the case with pyvenv.cfg. > Though I don’t get why it matters. Chiefly, I don't want to overload "pyvenv.cfg" with new behavior that's unrelated to virtual environments. I also dislike the way this file is found. If the parent directory is "C:\Program Files", then I'm not worried about finding "C:\Program Files\pyvenv.cfg" when the interpreter tries to open it. But this pattern is not safe in general when installed to an arbitrary directory, or with a portable distribution. The presence of a "._pth" file (Windows only) beside the DLL or executable bypasses the search for "pyvenv.cfg", among other things. The embedded distribution includes a ._pth that locks it down. This is another reason to use a different file to configure defaults for -X settings such as "utf8", a file that's guaranteed to always be read. >> Add an option in the installed "python.cfg" to set the name of the >> organization and application. > > That would work for, e.g. pyinstaller (which I hope already ignores these > kinds if configuration. > > But not for, e.g. web applications that expect to use virtual environments > to isolate themselves. The idea to use the profile data directories %ProgramData% and %LocalAppData% was for symmetry with how this could be supported in POSIX, which doesn't use the application directory as Windows does. The application "python.cfg" (in the directory of the executable, including a virtual environment) can support a setting to isolate it from system and user "python.cfg" files. 
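To make the proposed lookup concrete, here is a rough sketch of how candidate "python.cfg" locations might be listed. Everything in it is hypothetical: no such file exists in CPython, the function name, its parameters, and the "isolated" switch are invented for illustration, and the relative order of the system, user, and application files is an assumption that the thread does not settle.

```python
import ntpath

def candidate_config_paths(exe_dir, venv_dir=None, org='Python',
                           app='Python310', isolated=False, env=None):
    """List hypothetical "python.cfg" locations, lowest precedence first."""
    env = env or {}
    paths = []
    if not isolated:
        # System-wide and per-user files, parameterized by org/app names.
        for var in ('ProgramData', 'LocalAppData'):
            root = env.get(var)
            if root:
                paths.append(ntpath.join(root, org, app, 'python.cfg'))
    # Beside the base executable only, never its parent directory.
    paths.append(ntpath.join(exe_dir, 'python.cfg'))
    if venv_dir:
        # Beside "pyvenv.cfg"; supersedes the base installation.
        paths.append(ntpath.join(venv_dir, 'python.cfg'))
    return paths
```

With `isolated=True` only the application-level files remain, which corresponds to the isolation setting suggested for the application "python.cfg".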
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/T42D2VDMQ7JY7WYP2W3ALFHZGUYXLPZF/
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/6/21, Christopher Barker wrote:
> On Fri, Feb 5, 2021 at 12:59 PM Eryk Sun wrote:
>
> But why limit it to that? If there are more things to configure in an
> environment-specific way -- why not put it in this existing location?

I'd rather not limit the capability to just virtual environments.

>> I'd prefer a new configuration file that sets the default values for
>> -X implementation-specific options. The mechanism for finding this
>> file can support virtual environments.
>
> Then wouldn’t that simply be two configuration files that will be treated
> the same way?

Relative to the installation, "python.cfg" should only be found in the same directory as the base executable, not its parent directory. If "pyvenv.cfg" is found, then it's a virtual environment, and "python.cfg" will also be looked for in the directory of "pyvenv.cfg", where it supersedes settings in the base installation.

> I’m still convinced that It is a bad idea to have User-wide Python
> configuration like this. The fact is that different Python apps (may) need
> different configurations, and environments are the way to support that.

Add an option in the installed "python.cfg" to set the name of the organization and application. If not set, the organization and application respectively default to "Python" and "Python[-32]". Looking for system and user configuration would be parameterized using that name, i.e. "%ProgramData%\<organization>\<application>\python.cfg" and "%LocalAppData%\<organization>\<application>\python.cfg".

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/TXKCDQL3JNCUG52M265LU5O7USBWO7D6/
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/6/21, Inada Naoki wrote: > > If adding option to pyvenv.cfg is not make sense, we can add > `python.ini` to same place pyvenv.cfg. i.e., directory containing > python.exe, or one above directory. I'd rather look for "python.cfg" in the directory of the base executable (e.g. "C:\Program Files\Python310") and then in the directory of "pyvenv.cfg", if the latter is found. I wouldn't want it to check for "python.cfg" in the parent directory of the base executable. > And no need for all installations, and per-user setting. > Environment variable is that already. A configuration file in a profile data directory can target a particular version, such as "%LocalAppData%\Python\Python310-32\python.cfg". This is more flexible for the user to override a system installation, compared to setting PYTHONUTF8. However, it's not a major issue if you don't want to support the extra flexibility. That said, supporting %ProgramData% and %LocalAppData% data directories is more consistent with how this feature would be implemented in POSIX, such as "/etc/python3.10/python.cfg" and "$HOME/.config/python310/python.cfg". I think that matters because this file would be a good place to set defaults for all -X options (e.g. "utf8", "pycache_prefix", "faulthandler"). ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/3QGOQTRJTHQPQ5MQ2URCKKYBKASMAEH2/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/5/21, Barry Scott wrote: >> On 5 Feb 2021, at 11:06, Inada Naoki wrote: > >> python.exe lookup pyvenv.cfg even outside of venv. >> So we can write utf8mode=1 in pyvenv.cfg even outside of venv. I don't like extending "pyvenv.cfg" with generic settings. This is a file to configure a virtual environment in terms of finding the standard library and packages. I'd prefer a new configuration file that sets the default values for -X implementation-specific options. The mechanism for finding this file can support virtual environments. > This is the problem that I was thinking about when I proposed using > a py.ini like solution where the file is looked for in the users config > folder. I think that is the %LOCALAPPDATA% folder for py.exe. It is standard practice and recommended to create a directory for the organization or project and optionally a child directory for each application, such as "%ProgramData%\Python\Python38-32\python.ini" and "%LocalAppData%\Python\Python38-32\python.ini". I would have preferred for the py launcher to read and merge settings for all existing configuration files in the order of "%ProgramData%\Python\py.ini" (all installations), "%__AppDir__%\py.ini" (particular installation), and "%LocalAppData%\Python\py.ini" (user). ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2W6V2WURBTGEXOE7CH4B73IMMGUNHY3W/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Add a couple of options to open()'s mode parameter to deal with common text encodings
On 2/4/21, Ben Rudiak-Gould wrote:
>
> My proposal is to add a couple of single-character options to open()'s mode
> parameter. 'b' and 't' already exist, and the encoding parameter
> essentially selects subcategories of 't', but it's annoyingly verbose and
> so people often omit it.
>
> If '8' was equivalent to specifying encoding='UTF-8', and 'L' was
> equivalent to specifying encoding=(the real locale encoding, ignoring UTF-8
> mode), that would go a long way toward making open more convenient in the
> common cases on Windows, and I bet it would encourage at least some of
> those developing on Unixy platforms to write more portable code also.

A precedent for using the mode parameter is [_w]fopen in MSVC, which supports a "ccs=<encoding>" flag, where "<encoding>" can be "UTF-8", "UTF-16LE", or "UNICODE".

---

In terms of using the 'locale', keep in mind that the implementation in Windows doesn't use the current LC_CTYPE locale. It only uses the default locale, which in turn uses the process active (ANSI) code page. The latter is a system setting, unless overridden to UTF-8 in the application manifest (e.g. the manifest that's embedded in "python.exe").

I'd like to see support for a -X option and/or environment variable to make Python in Windows actually use the current locale to get the locale encoding (a real shocker, I know). For example, setlocale(LC_CTYPE, "el_GR") would select "cp1253" (Greek) as the locale encoding, while setlocale(LC_CTYPE, "el_GR.utf-8") would select "utf-8" as the locale encoding. (The CRT supports UTF-8 in locales starting with Windows 10, build 17134, released on 2018-04-03.)

At startup, Python 3.8+ calls setlocale(LC_CTYPE, "") to use the default locale, for use with C functions such as mbstowcs(). This allows the default behavior to remain the same, unless the new option also entails attempting locale coercion to UTF-8 via setlocale(LC_CTYPE, ".utf-8").

The following gets the current locale's code page in C:

    #include <locale.h>
    // ...
    loc = _get_current_locale();
    locinfo = (__crt_locale_data_public *)loc->locinfo;
    cp = locinfo->_locale_lc_codepage;

The "C" locale uses code page 0. C mbstowcs() and wcstombs() handle this case as Latin-1. locale._get_locale_encoding() could instead map it to the process ANSI code page, GetACP(). Also, the CRT displays CP_UTF8 (65001) as "utf8". _get_locale_encoding() should map it to "utf-8" instead of "cp65001".

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/MZC4DDCTMOX25ZQVUGBNLE6VPVXHXNKU/
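The mapping rules in the last paragraph of the message above can be written as a small pure function. This is only a sketch: the name `codepage_to_encoding` and the optional `ansi_codepage` parameter (standing in for the value of GetACP()) are assumptions made for illustration, not an existing API.

```python
CP_UTF8 = 65001

def codepage_to_encoding(codepage, ansi_codepage=None):
    """Map a CRT LC_CTYPE code page number to a Python codec name."""
    if codepage == 0:
        # "C" locale: the CRT treats it as Latin-1. Optionally map it to
        # the process ANSI code page instead, if the caller supplies one.
        if ansi_codepage is None:
            return 'latin-1'
        codepage = ansi_codepage
    if codepage == CP_UTF8:
        return 'utf-8'  # rather than 'cp65001'
    return f'cp{codepage}'
```

For example, `codepage_to_encoding(1253)` gives the codec name for the Greek code page selected by setlocale(LC_CTYPE, "el_GR").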
[Python-ideas] Re: Make UTF-8 mode more accessible for Windows users.
On 2/2/21, Christopher Barker wrote:
>
> In the common case, folks have their environment variables set in an
> initialization file (or the registry? I've lost track of what Windows does
> these days)

It hasn't fundamentally changed since the mid 1990s. Configurable system variables are set in the registry key "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Environment", and configurable user variables are set in "HKCU\Environment".

A process is spawned with an environment that's sourced from the parent process. Either it's inherited from the parent's environment or it's a new environment that was passed to CreateProcessW().

The ancestor of most interactive processes in a desktop session is the graphical shell, Explorer. At startup, it calls an undocumented shell32 function (RegenerateUserEnvironment) to load a new environment from scratch. It also reloads its environment in response to a WM_SETTINGCHANGE "Environment" message. The documented way to reload the environment from scratch is CreateEnvironmentBlock(&env, htoken, FALSE) and SetEnvironmentStringsW(env).

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ZKIGW3CIM7GGRAVTXL2V44XZ4Q7GXG7Q/
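The two registry keys named in the message above can be inspected from Python with the standard `winreg` module. The following is a minimal sketch; the function name `read_persistent_env` is made up, it returns an empty dict on platforms without `winreg`, and it returns values raw, without expanding REG_EXPAND_SZ references.

```python
try:
    import winreg  # available on Windows only
except ImportError:
    winreg = None

SYSTEM_ENV_KEY = r'SYSTEM\CurrentControlSet\Control\Session Manager\Environment'

def read_persistent_env(user=True):
    """Return the persisted user or system variables as a dict."""
    if winreg is None:
        return {}
    if user:
        hive, subkey = winreg.HKEY_CURRENT_USER, 'Environment'
    else:
        hive, subkey = winreg.HKEY_LOCAL_MACHINE, SYSTEM_ENV_KEY
    result = {}
    with winreg.OpenKey(hive, subkey) as key:
        # QueryInfoKey returns (subkey count, value count, last modified).
        num_values = winreg.QueryInfoKey(key)[1]
        for index in range(num_values):
            name, value, _type = winreg.EnumValue(key, index)
            result[name] = value
    return result
```

This only reads what is stored; it is not how a process obtains its environment, which is still inherited from or passed by the parent as described above.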
[Python-ideas] Re: Provide UTF-8 version of Python for Windows.
On 1/26/21, Eryk Sun wrote: > > The process active code page for GetACP() and GetOEMCP() is changed to > UTF-8 (65001). The C runtime also overrides the user locale to UTF-8 > if GetACP() returns UTF-8, i.e. setlocale(LC_CTYPE, "") will return > "utf8" as the encoding. One concern is what to do for the special "ansi" and "oem" encodings. If scripts rely on them for IPC, such as with subprocess.Popen(), then it could be frustrating if they're just synonyms for UTF-8 (code page 65001). I've tested that it's possible for Python to peg "ansi" and "oem" to the system ANSI and OEM code pages via GetLocaleInfoEx() with LOCALE_NAME_SYSTEM_DEFAULT and the LCType constants LOCALE_IDEFAULTANSICODEPAGE and LOCALE_IDEFAULTCODEPAGE (OEM). But then they're no longer accurate within the current process, for which ANSI and OEM are UTF-8. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5RCA3LVRBWVAHGDRGMR5RVAGP647NGDJ/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Provide UTF-8 version of Python for Windows.
On 1/25/21, Inada Naoki wrote: > > Microsoft provides UTF-8 code page for process. It can be enabled by > manifest file. > > How about providing Python binaris both of "UTF-8 version" and "ANSI > version"? I experimented with this manifest setting several months ago. To try it out, simply export the manifest from "python.exe", edit it to add the "activeCodePage" setting, and then replace it in "python.exe". The process active code page for GetACP() and GetOEMCP() is changed to UTF-8 (65001). The C runtime also overrides the user locale to UTF-8 if GetACP() returns UTF-8, i.e. setlocale(LC_CTYPE, "") will return "utf8" as the encoding. The console is hosted in a separate conhost.exe or openconsole.exe process, so it still defaults to the system OEM code page for its input and output code pages. This pertains only to low-level os.read() and os.write(). High-level console I/O uses io._WindowsConsoleIO for console files, which is internally UTF-16 and outwardly UTF-8. > * Windows team needs to maintain more versions. I suppose the installer could install both sets of binaries, and copy to "python[w][_d].exe" based on an installer option. But then the UTF-8 selection statistics wouldn't be tracked, unless the installer phones home. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/76TJ4CMMR2FXQGMKWOQCSBGVBG5DSN3K/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: built in to clear terminal
On 12/22/20, David Mertz wrote: > On Tue, Dec 22, 2020 at 10:26 PM Chris Angelico wrote: > > I'm not sure about Windows. Is 'cls' built into the command-line executable > itself (like Busybox) or is it an exe? CLS is an internal command of the CMD shell. An internal command takes precedence as long as the executed name is unquoted and has no extension. My concern with cls/clear is that they clear the scrollback. Is that what most people want from a clear_screen() function? ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PIVBSNSW5T7XTLVSYKM3PBEYUG2ROFMM/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: built in to clear terminal
On 12/22/20, Barry Scott wrote:
>
> import sys
>
> def clear_terminal():
>     if sys.platform == 'win32':
>         import ctypes
>         kernel32 = ctypes.windll.kernel32
>         # turn on the console ANSI colour handling
>         kernel32.SetConsoleMode(kernel32.GetStdHandle(-11), 7)
>
>     sys.stdout.write('\x1b[2J' '\x1b[H')

Here are some concerns I have:

* Does not support Windows 8
* Does not support legacy console in Windows 10 (on the "options" tab)
* Does not check for SetConsoleMode failure
* Does not support a different active screen buffer
* Assumes StandardOutput is a screen buffer for the current console
* Assumes the current mode of the screen buffer is 3 or 7. New modes have been added, and even more may be added
* Sets a global console setting that persists after Python exits

Like the CRT's "conio" API, clear_screen() should open "conout$" (temporarily), which will succeed if Python is attached to a console, regardless of the state of the standard handles, file descriptors, or sys.stdout, and will always open the currently active screen buffer, regardless of how many screen buffers exist in the current console session.

The current mode should be queried for ENABLE_VIRTUAL_TERMINAL_PROCESSING (4) via GetConsoleMode(). If it's not enabled, bitwise OR it into the mode and try to enable it via SetConsoleMode(). If VT mode is enabled, write '\x1b[2J\x1b[H' to the file.

If VT mode can't be enabled, then fall back on the legacy console API. In particular, some people mentioned not wanting to spawn a cmd.exe process just to use its CLS command. Even if spawning a process is okay, the CLS command clears the scrollback, which is inconsistent with ESC[2J. If clear_screen() is going to add ESC[3J to clear the scrollback, then it's at least consistent, but I'd rather not clear the scrollback.
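The VT-mode part of that outline could look roughly like the following ctypes sketch. It is an illustration only, not a proposed standard-library implementation: the name `clear_screen_vt` is made up, the legacy console fallback is left out (the function just returns False), and the Windows branch is untested here.

```python
import sys

ENABLE_VIRTUAL_TERMINAL_PROCESSING = 0x0004
CLEAR_AND_HOME = '\x1b[2J\x1b[H'

def clear_screen_vt():
    """Clear the active screen buffer via VT sequences (Windows only).

    Returns True on success, or False if VT mode can't be used, in which
    case a caller would fall back on the legacy console API.
    """
    if sys.platform != 'win32':
        raise NotImplementedError('Windows-only sketch')
    import ctypes
    import msvcrt
    import os
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    kernel32.GetConsoleMode.argtypes = (wintypes.HANDLE, wintypes.LPDWORD)
    kernel32.SetConsoleMode.argtypes = (wintypes.HANDLE, wintypes.DWORD)

    try:
        # "conout$" opens the active screen buffer of the attached console.
        fd = os.open('conout$', os.O_RDWR)
    except OSError:
        return False  # not attached to a console
    try:
        handle = msvcrt.get_osfhandle(fd)
        mode = wintypes.DWORD()
        if not kernel32.GetConsoleMode(handle, ctypes.byref(mode)):
            return False
        original = mode.value
        enabled = bool(original & ENABLE_VIRTUAL_TERMINAL_PROCESSING)
        if not enabled and not kernel32.SetConsoleMode(
                handle, original | ENABLE_VIRTUAL_TERMINAL_PROCESSING):
            return False  # e.g. legacy console or an older Windows
        try:
            os.write(fd, CLEAR_AND_HOME.encode('ascii'))
        finally:
            if not enabled:
                # Revert the mode if it was only enabled temporarily.
                kernel32.SetConsoleMode(handle, original)
        return True
    finally:
        os.close(fd)  # always close the temporary "conout$" file
```

The sketch deliberately writes only ESC[2J and ESC[H, so the scrollback is left alone.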
clear_screen() should be able to emulate ESC[2J via GetConsoleScreenBufferInfoEx (get the screen buffer size, window, cursor position, and default character attributes), ScrollConsoleScreenBuffer (if the screen buffer has to be scrolled up to make space), and SetConsoleScreenBufferInfoEx (shift the visible window in the buffer and set the cursor position). This can be implemented in ctypes or C. But normally the standard library avoids using ctypes. Finally, if VT mode was temporarily enabled, revert to the original mode, and always close the "conout$" file. Off topic comment: > kernel32 = ctypes.windll.kernel32 I recommend the following instead: kernel32 = ctypes.WinDLL('kernel32', use_last_error=True) The global library loaders such as ctypes.cdll and ctypes.windll are not reliable for production code in the wild. They cache CDLL library instances, which cache function pointers, which may have argtypes, restype, and errcheck prototypes. Another imported package might set function prototypes that break your code, which is an actual problem that I've seen a few times, particularly with common routines from kernel32, advapi32, and user32. It's not worth taking the chance of a conflict with another package just to save a few keystrokes. The global loaders also don't allow setting use_errno=True or use_last_error=True, so the function pointers they create don't capture the C errno value for ctypes.get_errno() or Windows last error value for ctypes.get_last_error(). Calling kernel32.GetLastError() after the fact may not be reliable in a scripting environment even if it's called directly after the previous FFI call. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/AHDEAUNZNUT6EWT7GTEGSKKFL3GABZ4W/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: built in to clear terminal
On 12/20/20, Cameron Simpson wrote: > On 20Dec2020 15:48, Christopher Barker wrote: > >>That would be great, though I just looked at the 3.9 docs and saw: >>"The Windows version of Python doesn’t include the curses module." > > Yeah, Windows. A C or ctypes implementation is required in Windows. Virtual terminal mode is supported in Windows 10, for which it *might* be enabled. This can be queried via GetConsoleMode. If virtual terminal mode isn't enabled, then clearing the screen has to be implemented by scrolling the console screen buffer. The screen buffer size, visible rectangle, and current character attributes can be queried via GetConsoleScreenBufferInfo. The window can be scrolled via ScrollConsoleScreenBuffer. The entire buffer can be scrolled out, like the CMD shell's CLS command, or one can just scroll the buffer enough to clear the visible window. The cursor can be set to the home position via SetConsoleCursorPosition. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/MRU5G2IB4CBSK5TAWKHS7EXIV6ECBEKO/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Global flag for whether a module is __main__
On 11/12/20, Chris Angelico wrote: > > I actually don't use the "if name is main" idiom all that often. The > need to have a script be both a module and an executable is less > important than you might think. In a huge number of cases, it's > actually better to separate out the library-like and script-like > portions into separate files, or some other reorganization. One caveat is that the module __name__ check is required for multiprocessing in spawn mode (as opposed to fork mode), which is the only supported mode in Windows. Generally, I think scripts in installed packages are better handled via setuptools entrypoints nowadays. For cross-platform support, pip automatically does the right thing for Windows by creating executable .exe launchers. The issue is two-fold: the default action of the .py file association is often configured to edit rather than execute scripts, and, irrespective of the latter, many execution paths, such as subprocess.Popen, use CreateProcess from the base API, which does not support file associations. (File associations are the closest Windows has to Unix shebangs, and the basis for how the py.exe launcher supports Unix shebangs in scripts, but they're implemented in the high-level shell API instead of the base API.) ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ENI3R2KWR7Q2V4POOWJLLHMBRKLWMSNZ/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: New feature
On 10/18/20, Mike Miller wrote: > On 2020-10-17 17:05, Eryk Sun wrote: >> CMD's CLS is implemented with three API calls: >> GetConsoleScreenBufferInfo to get the screen-buffer dimensions and >> default text attributes, ScrollConsoleScreenBufferW to shift the >> buffer out, and SetConsoleCursorPosition to move the cursor to (0,0). > > Would you happen to have a link to some Python/ctypes code to implement > this? I would expect this to be implemented in posixmodule.c, not via ctypes. I can help with the implementation in C. Read the following pages in the console docs, if you haven't already: https://docs.microsoft.com/en-us/windows/console/scrolling-the-screen-buffer https://docs.microsoft.com/en-us/windows/console/scrolling-a-screen-buffer-s-contents ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ZN7UFUZJASVZRXP5NVMM3BXTTYRGDMXW/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: New feature
On 10/18/20, Mike Miller wrote: > > Also, a shell is not a terminal, so terminal routines don't feel right in > shutil. Putting get_terminal_size() there was a mistake imho. The shutil module "offers a number of high-level operations on files". ISTM that shutil.get_terminal_size is a high-level operation on sys.__stdout__, if it's a terminal/console device file, though it's an odd duck since the rest of the module is dealing with filesystem files. That said, rightly or wrongly, I think of shutil as a collection of shell utility (SHell UTILity) functions for Python's standard library, so I'm comfortable with expanding its mandate to functions commonly supported by CLI shell environments, such as terminal/console management. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/237RUHNA6IFRXPBBWHH3QHIEUNO77MRG/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: New feature
On 10/17/20, Christopher Barker wrote: > > then how about os.clear_terminal() ? IMO, an os level function such as os.clear_terminal(fd) should only support terminal/console devices and would be implemented in Modules/posixmodule.c. Higher-level behavior and support for IDEs belongs in shutil. > Sure, there's a manageable set of default terminals across the major OSs > (and lInux Desktops), but there are a LOT of others as well, including > IDEs, and even the new Terminal in Windows: I would expect os.clear_terminal() to make exceptions only for popular terminals/consoles, if they don't support the common ANSI sequence to clear the screen. In Windows 10, you can enable virtual terminal (VT) mode by default for all non-legacy console sessions by setting "VirtualTerminalLevel" to 1 in "HKCU\Console". VT mode supports the standard ANSI sequences for clearing the terminal and/or scrollback. Regardless of the VirtualTerminalLevel setting, each tab in Windows Terminal is a headless pseudoconsole session (ConPTY) that has VT mode enabled by default. In all supported versions of Windows, if VT mode is disabled or not supported, as determined by GetConsoleMode, then the console screen buffer can be scrolled or cleared via GetConsoleScreenBufferInfo and ScrollConsoleScreenBuffer, and the cursor can be reset to (0,0) via SetConsoleCursorPosition. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/7TNXGFFPEUUGX7GTL4S4JRV4Q42EJTLL/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: New feature
On 10/16/20, Rob Cliffe via Python-ideas wrote: > > May I suggest it be called os.clearscreen()? I'd prefer shutil.clear_screen(). There's already shutil.get_terminal_size(). I know there's also os.get_terminal_size(), but its use isn't encouraged. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/RSYOV4KUQOFKUUQETFRXPWJC64RV47RN/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: New feature
On 10/13/20, Mike Miller wrote:
>
> The legacy Windows console has another limitation in that I don't believe it
> has a single API call to clear the whole thing. One must iterate over the
> whole buffer and write spaces to each cell, or some similar craziness.

No, it's not really similar craziness -- at least not from the client program's perspective. The implementation in the console host itself is probably something like that.

CMD's CLS is implemented with three API calls: GetConsoleScreenBufferInfo to get the screen-buffer dimensions and default text attributes, ScrollConsoleScreenBufferW to shift the buffer out, and SetConsoleCursorPosition to move the cursor to (0,0).

https://docs.microsoft.com/en-us/windows/console/scrollconsolescreenbuffer

The following debugger session is while stepped into CMD's eCls() function that implements the CLS command. This is just before it calls ScrollConsoleScreenBufferW, with parameters 1-4 in registers rcx, rdx, r8, and r9, and parameter 5 on the stack.

lpScrollRectangle (rdx): the entire screen buffer (sized 125 x 9001) is to be scrolled.

    0:000> ?? ((SMALL_RECT *)@rdx)
    struct _SMALL_RECT * 0x00e9`be8ff8b8
       +0x000 Left   : 0n0
       +0x002 Top    : 0n0
       +0x004 Right  : 0n125
       +0x006 Bottom : 0n9001

dwDestinationOrigin (r9): the target row is -9001, so the contents of the entire buffer are shifted out.

    0:000> ?? (short)(@r9 >> 16)
    short 0n-9001

lpFill (rsp / stack): use a space with the default attributes (in my case background color 0 and foreground color 7, in the current 16-color palette).

    0:000> ?? ((CHAR_INFO **)@rsp)[4]->Char.UnicodeChar
    wchar_t 0x20 ' '
    0:000> ?? ((CHAR_INFO **)@rsp)[4]->Attributes
    unsigned short 7

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/SYMJ4HM6ZUBA3HMS5QXIDVSMQDRECHFP/
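For readers who want to experiment from Python, the three calls described above can be sketched with ctypes. This is only an illustration under stated assumptions, not how CMD or the standard library does it: the function name clear_console_buffer is made up for this example, the CHAR_INFO class models only the UnicodeChar member of the union, and the standard output handle is assumed to be a console screen buffer (the calls fail and raise OSError if output is redirected). The rectangle and the destination row mirror the values seen in the debugger session.

```python
import ctypes
import sys
from ctypes import wintypes

STD_OUTPUT_HANDLE = -11
INVALID_HANDLE_VALUE = wintypes.HANDLE(-1).value


class CHAR_INFO(ctypes.Structure):
    # Simplification: only the UnicodeChar member of the Char union.
    _fields_ = [('UnicodeChar', wintypes.WCHAR),
                ('Attributes', wintypes.WORD)]


class CONSOLE_SCREEN_BUFFER_INFO(ctypes.Structure):
    _fields_ = [('dwSize', wintypes._COORD),
                ('dwCursorPosition', wintypes._COORD),
                ('wAttributes', wintypes.WORD),
                ('srWindow', wintypes.SMALL_RECT),
                ('dwMaximumWindowSize', wintypes._COORD)]


def clear_console_buffer():
    """Scroll the whole screen buffer out, roughly like CLS.

    Hypothetical helper. Returns False when not running on Windows.
    """
    if sys.platform != 'win32':
        return False

    # A private WinDLL instance, so prototypes set here cannot clash
    # with those set by other packages.
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.GetConsoleScreenBufferInfo.argtypes = (
        wintypes.HANDLE, ctypes.POINTER(CONSOLE_SCREEN_BUFFER_INFO))
    kernel32.ScrollConsoleScreenBufferW.argtypes = (
        wintypes.HANDLE,
        ctypes.POINTER(wintypes.SMALL_RECT),  # lpScrollRectangle
        ctypes.POINTER(wintypes.SMALL_RECT),  # lpClipRectangle (may be NULL)
        wintypes._COORD,                      # dwDestinationOrigin
        ctypes.POINTER(CHAR_INFO))            # lpFill
    kernel32.SetConsoleCursorPosition.argtypes = (
        wintypes.HANDLE, wintypes._COORD)

    hout = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    if hout is None or hout == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())

    # 1. Get the buffer dimensions and default text attributes.
    info = CONSOLE_SCREEN_BUFFER_INFO()
    if not kernel32.GetConsoleScreenBufferInfo(hout, ctypes.byref(info)):
        raise ctypes.WinError(ctypes.get_last_error())

    # 2. Shift the entire buffer up by its own height, filling the
    #    vacated cells with spaces in the default attributes.
    rect = wintypes.SMALL_RECT(0, 0, info.dwSize.X, info.dwSize.Y)
    dest = wintypes._COORD(0, -info.dwSize.Y)
    fill = CHAR_INFO(' ', info.wAttributes)
    if not kernel32.ScrollConsoleScreenBufferW(
            hout, ctypes.byref(rect), None, dest, ctypes.byref(fill)):
        raise ctypes.WinError(ctypes.get_last_error())

    # 3. Move the cursor to the home position.
    if not kernel32.SetConsoleCursorPosition(hout, wintypes._COORD(0, 0)):
        raise ctypes.WinError(ctypes.get_last_error())
    return True
```

The function prototypes are set on a private library instance on purpose, for the reason given elsewhere in this thread about the global ctypes.windll loader.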
[Python-ideas] Re: New feature
On 10/16/20, Steven D'Aprano wrote: > > On terminals that support it, this should work: > > - `print('\33[H\33[2J')` > > but I have no idea how to avoid clearing the scrollback buffer on > Windows, or other posix systems with unusual terminals. In Windows 10, ANSI sequences and some C1 control characters (e.g. clear via CSI -- '\x9b2J\x9bH') are supported by a console session if it's not in legacy mode. The ESC character can be typed as Ctrl+[, which is useful in the CMD shell, which doesn't support character escapes such as \33 or \x1b. It can also be set in an %ESC% environment variable. Using ANSI sequences and C1 controls requires virtual terminal (VT) mode to be enabled for the console screen buffer. VT mode is enabled by default in a pseudoconsole session (e.g. when attached to a tab in Windows Terminal), but it can be manually disabled. It's also enabled by default for non-legacy console sessions if "VirtualTerminalLevel" is set to 1 in "HKCU\Console". Regardless, it's simple to check whether VT mode is currently enabled for the screen buffer via WinAPI GetConsoleMode. If VT mode isn't enabled, the screen buffer can be scrolled using the console API function ScrollConsoleScreenBuffer using dimensions and attributes from GetConsoleScreenBufferInfo, and the cursor position can be set via SetConsoleCursorPosition. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2JOPCG55LD6I7S6673C3BNTH2EDSLSWH/ Code of Conduct: http://python.org/psf/codeofconduct/
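As a small illustration of the ANSI approach quoted above, here is a sketch that writes the same two sequences to a stream. The function name clear_screen_ansi is hypothetical. It assumes the stream is attached to a terminal that interprets ANSI sequences; on Windows that means VT mode must already be enabled for the console screen buffer, otherwise the escape characters are simply printed.

```python
import io
import sys

ESC = '\x1b'  # same character as '\33' in the quoted example

# Cursor home (CSI H) followed by erase display (CSI 2 J).
CLEAR_SCREEN = ESC + '[H' + ESC + '[2J'


def clear_screen_ansi(stream=None):
    """Write the ANSI clear-screen sequences to the given text stream."""
    if stream is None:
        stream = sys.stdout
    stream.write(CLEAR_SCREEN)
    stream.flush()


# Usage: capture the output instead of sending it to a real terminal.
demo = io.StringIO()
clear_screen_ansi(demo)
```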
[Python-ideas] Re: New feature
On 10/16/20, Barry Scott wrote:
>
> I find that you have to do this to turn on ANSI processing in CMD.EXE on
> Windows 10 and I assume earlier Windows as well:

You mean the console-session host (conhost.exe). This has nothing to do with the CMD shell. People often confuse CLI shells (CMD, PowerShell, bash) with the console/terminal that they use for standard I/O.

Virtual Terminal mode is supported by the new console in Windows 10 -- not in earlier versions of Windows and not with the legacy console in Windows 10. If you need to support ANSI sequences with the legacy console host, consider using a third-party library such as colorama.

You can enable VT mode by default for regular console sessions (i.e. not headless sessions such as under Windows Terminal, for which it's always enabled) by setting a DWORD value of 1 named "VirtualTerminalLevel" in the registry key "HKCU\Console".

> import ctypes
> kernel32 = ctypes.windll.kernel32
> # turn on the console ANSI colour handling
> kernel32.SetConsoleMode( kernel32.GetStdHandle( -11 ), 7 )

You should enable the flag in the current mode value and implement error handling:

    import ctypes

    # private library instance that captures the Windows last error
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

    STD_OUTPUT_HANDLE = -11
    ENABLE_VIRTUAL_TERMINAL_PROCESSING = 4
    INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value

    # declare prototypes so that handles are passed as pointer-sized values
    kernel32.GetStdHandle.restype = ctypes.c_void_p
    kernel32.GetConsoleMode.argtypes = (
        ctypes.c_void_p, ctypes.POINTER(ctypes.c_ulong))
    kernel32.SetConsoleMode.argtypes = (ctypes.c_void_p, ctypes.c_ulong)

    hstdout = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    if hstdout == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())

    # read the current mode, set the VT flag, and write the mode back
    mode = ctypes.c_ulong()
    if not kernel32.GetConsoleMode(hstdout, ctypes.byref(mode)):
        raise ctypes.WinError(ctypes.get_last_error())

    mode.value |= ENABLE_VIRTUAL_TERMINAL_PROCESSING
    if not kernel32.SetConsoleMode(hstdout, mode):
        raise ctypes.WinError(ctypes.get_last_error())

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/UHIZNALCL7UGX5LXJACHHKOHMUMXACKN/
[Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous
On 5/25/20, Christopher Barker wrote:
> On Mon, May 25, 2020 at 10:59 AM Steve Barnes wrote:
>
>> On Windows
>> https://freetechtutors.com/create-virtual-hard-disk-using-diskpart-windows/
>> gives a nice description of creating a virtual disk with only operating
>> system commands. Note that it shows the commands being used interactively
>> but it can also be scripted by starting diskpart /s script.txt -
>
> ...
>
>> it does need to be run as admin.
>>
>
> Well, darn. That seriously reduces its usefulness.

Creating and mounting a VHD can be implemented in PowerShell as well, but it requires installing the Hyper-V management tools and services (not the hypervisor itself). The Hyper-V cmdlets for managing virtual disks (e.g. mount-vhd) also require administrator access, since the underlying API does (e.g. AttachVirtualDisk). A new system service and client application would have to be developed in order to allow standard users to manage virtual disks.

Here's a PowerShell example to create, format, and mount a volume on a 10 MiB virtual disk. This example mounts the volume both at "V:/" and at "C:/Mount/vhd_mount":

    $vhdpath = 'C:\Mount\temp.vhdx'
    $mountpath = 'C:\Mount\vhd_mount'

    # create and mount the physical disk as a RAW volume
    new-vhd -path $vhdpath -fixed -sizebytes (10 -shl 20)
    mount-vhd -path $vhdpath

    # create a partition on the disk, format it as NTFS, and assign
    # the DOS device name "V:" and label "vhd"
    $nd = (get-vhd -path $vhdpath).DiskNumber
    new-volume -disknum $nd -filesys ntfs -drive V -friendly vhd

    # set a folder mountpoint
    mkdir $mountpath
    add-partitionaccesspath -drive V -accesspath $mountpath

If no drive letter is desired, use the -accesspath option of the new-volume cmdlet instead of the -driveletter option.

The following command dismounts the disk:

    dismount-vhd -path $vhdpath

The mountpoint on the empty directory remains set but inaccessible once the disk is dismounted. You can delete this directory if the disk won't be mounted again. Or, while the disk is mounted, you can remove the mountpoint via remove-partitionaccesspath.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/3EEZV32XMNGLGB6Q267MGYHPBSTO55FH/
[Python-ideas] Re: Sanitize filename (path part) 2nd try
On 5/13/20, Antoine Pitrou wrote:
>
> If you know of a system function which accepts filenames with embedded
> NULs (which probably means it also takes the filename length as a
> separate parameter), I'd be curious to know about it.

Windows is layered over the base NT system, which uses counted strings and a root object namespace that reserves only the path separator, backslash. Null characters are allowed, at least as far as the object manager cares, but using them is a bad idea, if only because such names aren't generally accessible in Windows. But let's look at an example just for kicks.

When the object manager parses a path up to a Device object (e.g. "\Device\NamedPipe"), the I/O manager takes over parsing the remaining path, which calls the device driver's IRP_MJ_CREATE routine with the remaining path. Whether or not a name with nulls is allowed depends on the device driver -- or a filesystem driver if the device is mounted.

Almost all filesystem drivers reject a component name that contains nulls as invalid. One exception is the named-pipe filesystem (NPFS). NPFS doesn't disallow any characters. It even allows backslash in pipe names since it doesn't support subdirectories, and if you check via os.listdir('//./pipe'), you should see several Winsock pipes with backslash in their name.

Creating a pipe with nulls in its name is impossible via WINAPI CreateNamedPipeW. It requires native NtCreateNamedPipeFile, with the name passed in an OBJECT_ATTRIBUTES record [1]. This system function is undocumented, but just to show that it's possible in principle, I created a pipe named "spam\x00eggs". We can query the name via GetFileInformationByHandleEx: FileNameInfo [2], which returns a counted string:

    >>> GetFileInformationByHandleEx(h, FileNameInfo)
    '\\spam\x00eggs'

The name is in the root path of the device, but we don't get the fully-qualified name "\\Device\\NamedPipe\\spam\x00eggs". WINAPI GetFinalPathNameByHandleW [3] can figure this out, at least for the native NT path (from NtQueryObject). However, it works with null-terminated strings, so the pipe name gets truncated as "spam":

    >>> flags = VOLUME_NAME_NT | FILE_NAME_OPENED
    >>> GetFinalPathNameByHandle(h, flags)
    '\\Device\\NamedPipe\\spam'

[1]: https://docs.microsoft.com/en-us/windows/win32/api/ntdef/ns-ntdef-_object_attributes
[2]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_name_info
[3]: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfinalpathnamebyhandlew

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/6EB4UMWV3TRMWL6RPY2KFU7PYJTYF4SY/
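The difference between a counted string and a null-terminated string can be reproduced without any Windows API at all. The following sketch uses a plain ctypes wide-character array; it is only an analogy for the truncation described above, not the code that produced the session output. Reading the array's value attribute stops at the first null, while slicing by an explicit length returns every character.

```python
import ctypes

name = 'spam\x00eggs'

# A wchar_t array holding the name plus a terminating null. Elements are
# assigned one by one so that the embedded null is stored as-is.
buf = (ctypes.c_wchar * (len(name) + 1))(*name)

# Null-terminated view: reading stops at the first null character.
terminated = buf.value

# Counted view: the caller supplies the length, so nothing is lost.
counted = buf[:len(name)]

print(repr(terminated))
print(repr(counted))
```

An API that returns a length along with the buffer, as the file-name query above does, behaves like the counted view. An API that returns only a pointer to a null-terminated string behaves like the value attribute.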
[Python-ideas] Re: Sanitize filename (path part) 2nd try
On 5/11/20, Oleg Broytman wrote:
> On Mon, May 11, 2020 at 09:12:52PM -, Steve Jorgensen wrote:
>
>> When the platform is Windows, certainly, ":" should not be
>> allowed, and perhaps colon should not be allowed at all.

The meaning of "<drive>:name" is context dependent. If it occurs at the beginning of a path, it's relative to the working directory on drive "<drive>:", which defaults to the root directory on the drive. For example, if the working directory on drive "X:" is "X:\spam\eggs", then "X:foo" resolves to "X:\spam\eggs\foo". "X:foo" in this context is not a valid component name; it's actually a filepath.

Otherwise ":" is part of an NTFS or ReFS stream path, where ":" is the stream delimiter. To be valid, it needs to be followed by either the name of the stream or the name plus the type, e.g. "filename:streamname" or "filename:streamname:streamtype". Should file streams be supported?

More on File Streams

An open or create will fail as an invalid filename if it uses invalid stream syntax or references a stream type that's unknown, or if the filesystem doesn't support streams and disallows colon in filenames (e.g. FAT32).

The stream name can be empty to indicate an anonymous or default stream, but only if the stream type is specified. For example, in NTFS "filename::$DATA" is the anonymous data stream in a file named "filename". For a regular data file, it's the same as just accessing "filename".

A directory can have named data streams, but it cannot have an anonymous data stream. The default stream in a directory is an index stream named "$I30". The following are equivalent names for a directory in NTFS: "dirname", "dirname::$INDEX_ALLOCATION", and "dirname:$I30:$INDEX_ALLOCATION". But "dirname:$I30" doesn't work because the default stream type is $DATA.

To access a stream in a single-letter filename relative to the current directory, the current directory has to be referenced explicitly via the "." component.
For example, "./C:spam" is a stream named "spam" in a file named "C" that's in the current working directory, but "C:spam" is a file named "spam" in the working directory on drive "C:".

>Forbidden characters:
>
> chr(0) < > : " / \ | ? *
>
> characters in range from chr(1) through chr(31),

See the above discussion regarding ":". An NTFS stream name can include any character except for nul (0), colon, backslash, and slash.

The characters *?"<> are the five wildcard characters that almost all NT filesystems disallow in filenames. These are important to disallow because the filesystem driver (in the kernel) is expected to support filtering a directory listing with a wildcard pattern. NT's * and ? wildcards have Unix shell semantics. The other three are DOS_DOT ("), DOS_STAR (<), and DOS_QM (>), which help to emulate MS-DOS behavior.

The vertical bar or pipe (|) has no significance in filepaths, but it's a special shell character that's usually disallowed in filenames. Control characters 1-31 usually are also disallowed. That said, some non-Microsoft filesystems may allow these characters. For example, the VirtualBox shared-folder filesystem allows pipe and control characters in filenames.

> a space or a period at the end of file/directory name.

Trailing spaces and dots are stripped from the final path component in almost all contexts. The exception is that "\\?\" device paths are never normalized in an open or create context. For example, creating "\\?\C:\Temp\spam. . . " will name the file "spam. . . " instead of the normal name "spam". The name "spam. . . " will appear in the directory listing, but opening it will require using a "\\?\" device path.

> Forbidden file names (with any extensions):
>
> CON, PRN, AUX, NUL,
> COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9,
> LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.
In an attempt to replicate how MS-DOS implemented devices, Windows reserves DOS device names such as "NUL" in the final component of DOS drive-letter paths and relative paths. They are not reserved in the final component of UNC and device paths, though a server may disallow them by policy, as Microsoft's SMB server does. Matching the device name ignores everything after a trailing colon or dot that follows the name with 0 or more intervening spaces. This is more than ignoring an extension, which is typically taken as the characters following the last dot in a filename. "CONIN$" and "CONOUT$" are mistakenly excluded from the documented list of reserved DOS device names. Windows has always reserved them as unqualified relative names in a create/open context. Starting with Windows 8, they're reserved exactly the same as the classic DOS device names. Examples with trailing dots and spaces: >>> os.getcwd() 'C:\\' >>> nt._getfullpathname('spam. . . ') 'C:\\spam' >>> nt._getfullpathname('foo/spam. . . ') 'C:\\foo\\spam' DOS devices: >>> nt._getfullpathname('conin$:spam.eggs') '.\\conin$' >>> nt._getfullpathname('foo/conin$
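The rules discussed in this message can be approximated in a few lines of Python. This is a rough sketch, not a complete or authoritative sanitizer: the function names are made up for illustration, the device list is the quoted one plus "CONIN$" and "CONOUT$", and the matching rule is a simplified reading of "ignore everything after a trailing colon or dot that follows the name with 0 or more intervening spaces". Actual behavior depends on the Windows version and on the filesystem.

```python
import re

# Reserved DOS device names, followed by optional spaces and then either
# the end of the name or a dot/colon with anything after it.
_RESERVED = re.compile(
    r'^(CON|PRN|AUX|NUL|CONIN\$|CONOUT\$|COM[1-9]|LPT[1-9])'
    r' *([.:].*)?$',
    re.IGNORECASE | re.DOTALL)

# Control characters 0-31 plus the characters quoted as forbidden.
_FORBIDDEN_CHARS = set('<>:"/\\|?*') | {chr(i) for i in range(32)}


def is_reserved_dos_name(name):
    """Return True if a path component would match a DOS device name."""
    return _RESERVED.match(name) is not None


def is_safe_component(name):
    """Conservative check of a single path component (hypothetical helper)."""
    if not name:
        return False
    if any(ch in _FORBIDDEN_CHARS for ch in name):
        return False
    if name[-1] in ' .':
        # Trailing spaces and dots are normally stripped by Windows.
        return False
    return not is_reserved_dos_name(name)


# Usage
checks = {name: is_safe_component(name)
          for name in ('spam.txt', 'nul.txt', 'spam. . . ', 'a:b')}
```

Rejecting the colon outright, as this sketch does, is the conservative choice; it also rules out legitimate stream paths such as "filename:streamname".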
[Python-ideas] Re: Improve the Windows python installer to reduce new user confusion
On 4/11/20, Barry Scott wrote: >> On 10 Apr 2020, at 20:14, Christopher Barker wrote: >> >> Also, if order to get python top level scripts to work, there needs to be >> a PATH entry for that, too. > > Do you mean the #! lines? That is taken care of by py.exe and how it was > installed. I think by "top level" Christopher means running "foo.py" directly, or just "foo" if ".PY" is in PATHEXT. The installer's option to update environment variables adds the "Scripts" directory to PATH and adds the .PY and .PYW file extensions to PATHEXT. It would be more flexible to split this out as an independent option. (Note that the "Scripts" directory also contains scripts that are embedded in a launcher executable, such as pip.exe, which distlib uses for entry-point scripts. But many entry-point scripts are commonly run via py.exe instead, such as `py -m pip`.) ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/22OFVG52S522CDTRZXLTL4DEX26522RS/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Improve the Windows python installer to reduce new user confusion
On 4/10/20, MRAB wrote: > On 2020-04-10 20:14, Christopher Barker wrote: > >> How does py.exe get on the PATH? >> > py.exe goes into the Windows folder, which is on the PATH. That's the typical setup, but a standard user that can't get OTS administrator access has to install the launcher just for the current user, in "%LocalAppData%\Programs\Python\Launcher", which the installer automatically adds to the per-user PATH. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5RIUDB7L4ON3SFIPEYBLOZCXQN3WUIYB/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: About python3 on windows
On 3/25/20, Barry Scott wrote: >> On 25 Mar 2020, at 09:15, Eryk Sun wrote: >> >> That is not consistent with Unix. env is supposed to search PATH for >> the command. However, the launcher does not search PATH for a >> versioned command such as "python3". Instead it uses the highest >> version that's registered for 3.x or 2.x, respectively, or the version >> set by PY_PYTHON3 or PY_PYTHON2 if defined, respectively. > > I think the reasoning is that the whole point of the py.exe is to avoid > having users edit their PATH on Windows. And further the thinking > goes that you do not need the alternatively named python programs. The py launcher's "env" command searches PATH for anything from "python" to "notepad" -- but not for a versioned Python command such as "python3" or "python2". It always uses a registered installation in this case, which is at the very least problematic when using "#!/usr/bin/env python3" in an active virtual environment. Paul Moore will probably suggest that the script should use "#!/usr/bin/env python" instead, but that will run 2.x in most Unix systems unless a 3.x environment is active. We can assume that such a script requires 3.x and is meant to run flexibly, in or out of an active environment. I'd prefer a consistent implementation of the "env" command that doesn't special case versioned "pythonX[.Y]" commands compared to plain "python". But another option that will at least make virtual-environment users happy would be for "env" to check for an active VIRTUAL_ENV and read its Python version from "pyvenv.cfg". ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/NEO6ZBIIGL2JWVG77SHUKNTWLY2ZFJ5G/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: About python3 on windows
On 3/25/20, Steve Barnes wrote: >> Except it's not necessarily what the original post wants. The OP wants the >> shebang "#!/usr/bin/env python3" to "work everywhere by >> default", for which I assume it's implied that it should work consistently >> everywhere. I'd prefer for the launcher's env search to also >> support versioned "pythonX[.Y][-32|-64]" commands such as "python3". > > The windows launcher already does support this with shebangs of: > #!/usr/bin/env python3 # Launch with the latest/preferred version of python3 > #!/usr/bin/env python2 # Launch with the latest/preferred version of python2 That is not consistent with Unix. env is supposed to search PATH for the command. However, the launcher does not search PATH for a versioned command such as "python3". Instead it uses the highest version that's registered for 3.x or 2.x, respectively, or the version set by PY_PYTHON3 or PY_PYTHON2 if defined, respectively. > #!/usr/bin/env python # Launch with the latest/preferred version of python 2 > unless PY_PYTHON=3[.n[-64/32]] is set or py.ini has the same in. In this case "env" first searches PATH before falling back on registered installations and PY_PYTHON, which is correct -- at least for the PATH search. I would prefer that "env" never checks registered installations. For the registry fallback, it should instead check the user and system "App Paths" key, like what ShellExecuteExW does. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/YX4DOI4MWTB7AVL4QMT5EN4TQBNRSHEZ/ Code of Conduct: http://python.org/psf/codeofconduct/
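To make the "App Paths" suggestion concrete, here is a sketch of such a lookup using the standard winreg module. It is not what the launcher does today, and the function name find_app_path is hypothetical. It assumes the usual location of the key under "Software\Microsoft\Windows\CurrentVersion\App Paths" in the user and system hives, checks the user hive first, and returns None on other platforms or when no matching subkey exists.

```python
import sys

APP_PATHS = r'Software\Microsoft\Windows\CurrentVersion\App Paths'


def find_app_path(command):
    """Return the executable registered for command (e.g. 'python3.exe').

    Checks the per-user key first, then the system key. Returns None if
    the command isn't registered or if not running on Windows.
    """
    if sys.platform != 'win32':
        return None
    import winreg
    for hive in (winreg.HKEY_CURRENT_USER, winreg.HKEY_LOCAL_MACHINE):
        try:
            with winreg.OpenKey(hive, APP_PATHS + '\\' + command) as key:
                # The fully-qualified filename is the key's default value.
                value, value_type = winreg.QueryValueEx(key, '')
        except OSError:
            continue
        if value_type == winreg.REG_EXPAND_SZ:
            value = winreg.ExpandEnvironmentStrings(value)
        return value
    return None
```

A launcher that falls back on this lookup would let users register "python3.exe" once instead of copying or linking executables into a directory on PATH.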
[Python-ideas] Re: About python3 on windows
On 3/25/20, Steve Barnes wrote: > Of course if, rather than creating symlinks, you create a batch file called > python3.bat and containing the line: > @py -3 %* Batch scripts execute via cmd.exe, with an attached console, and when Ctrl+C is typed they display a "Terminate batch job (Y/N)?" prompt when cmd.exe resumes. This makes them a poor substitute for a link. If the link is created beside "py.exe", it's better to use a relative symlink. If the link is in another directory, use a shell link (i.e. shortcut). Set the command to run as "C:\Windows\py.exe -3", or wherever "py.exe" is installed. The shell API will pass command line arguments. Clear the shortcut's "start in" field in order to inherit the parent's working directory. Add ".LNK" to PATHEXT to be able to run "python3" on the command line instead of requiring "python3.lnk". > It is also worth mentioning that python (and the py launcher) both accept > windows paths (\ separated) and *nix paths (/ separated) from the command > line and that from within scripts the *nix path separator is to be preferred That's generally true. Note however that the only reliable way to access a path that exceeds MAX_PATH characters (260, or less depending on context) is with a \\?\ extended path, which must use backslash. (Python 3.6+ does support long normal paths in Windows 10, but this capability has to be enabled at the system level. Plus many scripts and applications still need to support Windows 7 and 8.) An extended path is also required to open files with certain reserved names such as DOS devices (e.g. "con" or "nul:.txt") and names that end with spaces and dots (e.g. "spam. . ."). But please do not use an extended path in order to assign reserved names. It just causes needless problems. > glob.glob("C:/Users/Gadget/Documents/*.docx") - the only real issue to avoid > is the fact that Windows paths are case insensitive so names that differ > only in case changes can & will collide. 
A FAT32 filesystem is case insensitive in Unix (e.g. on a portable drive), so this problem isn't limited to Windows. It's just more common in Windows. Also, an NTFS directory tree can be flagged as case sensitive in Windows 10, but thankfully this isn't commonly used, even by developers. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/HHOB2HGK7AN7ZATWAS2GPBMN3CDOLGKY/ Code of Conduct: http://python.org/psf/codeofconduct/
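As a rough illustration of the \\?\ extended-path point above, here is a minimal sketch of converting a fully-qualified path to extended form. The helper name `to_extended_path` is hypothetical, not part of the standard library; it uses only `ntpath`, so it also runs on non-Windows systems for experimentation.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"            # \\?\
DEVICE_PREFIX = "\\\\.\\"              # \\.\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"   # \\?\UNC\


def to_extended_path(path):
    # Hypothetical helper: return the \\?\ form of a fully-qualified path.
    if path.startswith((EXTENDED_PREFIX, DEVICE_PREFIX)):
        return path  # already a device path; leave it untouched
    # Extended paths are not normalized by the system, so convert forward
    # slashes to backslashes and resolve "." and ".." components here.
    norm = ntpath.normpath(path)
    drive, rest = ntpath.splitdrive(norm)
    if not drive or not rest.startswith("\\"):
        raise ValueError("path must be fully qualified: %r" % (path,))
    if drive.startswith("\\\\"):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_UNC_PREFIX + norm[2:]
    return EXTENDED_PREFIX + norm


if __name__ == "__main__":
    print(to_extended_path("C:/Users/Gadget/Documents/report.docx"))
    print(to_extended_path("//server/share/dir/file.txt"))
```

The sketch rejects relative and drive-relative paths rather than guessing a working directory, since an extended path has to be fully qualified.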
[Python-ideas] Re: About python3 on windows
On 3/24/20, Mike Miller wrote: > On 2020-03-24 11:58, Eryk Sun wrote: > >> You can manually copy or symlink python.exe to python3.exe in the >> installation directory and venv "Scripts" directories. However, it >> will only be used on the command line, and other contexts that search >> PATH. Currently the launcher will not use it with a virtual "env" >> shebang. The launcher will search PATH for "python", but not >> "python3". > > Thanks. Sure, there are many ways to fix this manually, or work around it. Except it's not necessarily what the original post wants. The OP wants the shebang "#!/usr/bin/env python3" to "work everywhere by default", for which I assume it's implied that it should work consistently everywhere. I'd prefer for the launcher's env search to also support versioned "pythonX[.Y][-32|-64]" commands such as "python3". I'd also prefer the env search to check the user and system "App Paths" key [1] if the name isn't found in PATH. Each subkey of "App Paths" is the name of a command such as "python3.exe", for which the fully-qualified executable filename is the key's default value. This is the Windows shell API equivalent of creating symlinks in "~/.local/bin" and "/usr/bin" on Unix systems. ShellExecuteExW checks "App Paths", but the launcher has to use CreateProcessW, which is beneath the shell API. > Would be great if it was consolidated, with one command "to rule them all." I'm in favor of "py" becoming the cross-platform command to run Python from the command line, since there's already a lot of inertia in that direction on Windows. Brett Cannon is working on a Unix version [2]. 
[1]: https://docs.microsoft.com/en-us/windows/win32/shell/app-registration#using-the-app-paths-subkey [2]: https://crates.io/crates/python-launcher ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/VHHZ263OEMNISHMBTPPTC7OVYJNA4KIO/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: About python3 on windows
On 3/24/20, Mike Miller wrote: > > C:\Users\User>python3 > (App store loads!!) If installed, the app distribution has an appexec link for "python3.exe" that actually works. > C:\Python38>dir > Volume in drive C has no label. > [snip] > Note there is no python3.exe binary. You can manually copy or symlink python.exe to python3.exe in the installation directory and venv "Scripts" directories. However, it will only be used on the command line, and other contexts that search PATH. Currently the launcher will not use it with a virtual "env" shebang. The launcher will search PATH for "python", but not "python3". ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/6U64JXGJZ4TFTCXJ6X636AYI5QYQLVMX/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: About python3 on windows
On 3/24/20, Barry Scott wrote: > > If you have python 2 and 3 installed then > >py -3 myscript "myscript" may have a shebang that runs the "python2" virtual command (e.g. "#!python2" or "#!/usr/bin/python2") because the script requires 2.x, but using "-3" will override it to run the "python3" virtual command instead. The "python2" virtual command defaults to the highest installed version of 2.x. The "python3" virtual command defaults to the highest installed version of 3.x. Without a shebang, the "python" virtual command defaults to the highest installed 3.x, but with a shebang it defaults to the highest installed 2.x. py without a script (e.g. the REPL or -c or -m) uses the "python" virtual command, so it defaults to the highest installed 3.x. The version to run for the "python" virtual command is set via the PY_PYTHON environment variable, whether or not there's a shebang. Similarly the version to run for the "python2" and "python3" virtual commands is set via PY_PYTHON2 and PY_PYTHON3. No sanity checks are performed, so "PY_PYTHON2=3" is allowed. >> #! /usr/bin/env python3 > > This does work out of the box because py.exe is run when you execute a .py > in the CMD. The "/usr/bin/env" virtual command is expected to search PATH. py does search PATH for "/usr/bin/env python", but for "/usr/bin/env python3" it uses the "python3" virtual command instead of searching, since standard Python installations and virtual environments do not include "python3.exe". There's an open issue for this, but there's no consensus. > You can check by doing: > > assoc .py > ftype Python.File > > If Python.File is not using py.exe then you can fix that with this command > from an Admin CMD. > > ftype Python.File="C:\windows\py.exe" "%1" %* CMD's internal assoc and ftype commands are no longer useful in general. They date back to Windows NT 4 (1996) and have never been updated. 
As far as I know, Microsoft has no up-to-date, high level commands or PowerShell cmdlets to replace assoc and ftype. In PowerShell I suppose you could pInvoke the shell API (e.g. AssocQueryStringW). assoc and ftype only access the basic system file types and progids in "HKLM\Software\Classes", not the HKCR view that merges in and prefers "HKCU\Software\Classes" or various other subkeys such as "Applications" and "SystemFileAssociations". Also, they don't account for the cached and locked user choice in the shell. For example, assoc may tell you that ".py" is associated with the "Python.File" progid. But it's potentially wrong. It's not aware of an association set in "HKCU\Software\Classes\.py" (e.g. set by a per-user Python installation). It's also not aware of a locked-in user choice (i.e. the user selected to always use a particular app), if one is set, in "HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\FileExts\.py\UserChoice". The user should set the file association using the settings dialog for choosing default apps by file type or the "open with" dialog on the right-click context menu. If the application chooser doesn't include a Python icon with a rocket on it, then probably the launcher and Python.File progid are installed for all users, but there's a per-user association in "HKCU\Software\Classes\.py" that's overriding the system setting. Deleting the default value in the latter key should restore the launcher to the list of choices. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/C3ZH4TZLN3APOF3PVSEEZM6XIVCIIFVG/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Control adding script path/CWD to sys.path
On 2/24/20, jdve...@gmail.com wrote: > > I try to use along with -m (`python -I -m a.b`) and get this error: "python: > Error while finding module specification for 'a.b' (ModuleNotFoundError: No > module named 'a')". This is a use case for -m that requires adding the working directory to sys.path. I work in virtual environments, and I don't navigate into a package and execute modules. The target package is always either in the standard library or installed in site-packages, and the module is executed from the top level. So for me adding the working directory is a feature I never need, and I completely forgot about why anyone would want it. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/RAZRNATFDLGS3MIAIEXXLKQTD447UK3P/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Control adding script path/CWD to sys.path
On 2/24/20, jdve...@gmail.com wrote: > > It is the intended and the expected behaviour. The working directory is > always added to the sys.path. You mean always in this particular context, i.e. the working directory is added normally when executing a command via -c or a module as a script via -m. When executing a script normally, the script directory gets added, which is reasonably secure. Adding the working directory to sys.path is ok for the interactive shell and -c commands, but I don't understand why it gets added with -m, which is a security hole, and to me an annoyance. It can be disabled with isolated mode, but that's a blunt instrument that disables too much. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/6SSCBUIPMFJC2ZR67DVTHICN3B5UDX2F/ Code of Conduct: http://python.org/psf/codeofconduct/
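The behaviour discussed in this thread can be observed with a small experiment. The following is only a sketch: the module name `cwd_probe_demo` and the function `run_module_from_cwd` are made up for illustration. It writes a throwaway module into a temporary directory and runs it with -m from that directory, with and without isolated mode.

```python
import os
import subprocess
import sys
import tempfile


def run_module_from_cwd(isolated):
    # Create a throwaway module in a temp directory and run it with -m,
    # using that directory as the child's working directory.
    with tempfile.TemporaryDirectory() as workdir:
        module_path = os.path.join(workdir, "cwd_probe_demo.py")
        with open(module_path, "w") as f:
            f.write("print('imported from the working directory')\n")
        env = dict(os.environ)
        # In Python 3.11+, PYTHONSAFEPATH would also suppress the entry.
        env.pop("PYTHONSAFEPATH", None)
        options = ["-I"] if isolated else []
        proc = subprocess.run(
            [sys.executable, *options, "-m", "cwd_probe_demo"],
            cwd=workdir, env=env, capture_output=True, text=True)
        return proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    print("normal:  ", run_module_from_cwd(isolated=False))
    print("isolated:", run_module_from_cwd(isolated=True))
```

Without -I the module is found because the working directory is on sys.path; with -I the import fails, which matches the ModuleNotFoundError reported earlier in the thread.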
[Python-ideas] Re: Add logging to subprocess.Popen
On 2/24/20, Guido van Rossum wrote: > > The stdlib does very little logging of its own -- logging is up to > the application. It's not logging per se, but the standard library does have an extensive and growing list of audit events that are intended to assist with testing, logging and security monitoring. https://docs.python.org/3/library/audit_events.html https://www.python.org/dev/peps/pep-0578 An event is generated for subprocess.Popen that includes the executable, args, cwd, and env parameters. There's no event for the result, however. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/OZREZJRFMJERP4OMG23BEPVPGPUYBXU7/ Code of Conduct: http://python.org/psf/codeofconduct/
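For anyone who wants to try the audit event mentioned above, here is a minimal sketch that assumes Python 3.8 or later (where `sys.addaudithook` is available). The names `seen` and `log_popen` are just for illustration. The hook records the executable, args, and cwd of each "subprocess.Popen" event.

```python
import subprocess
import sys

seen = []


def log_popen(event, args):
    # Audit hooks receive every event, so filter for the one of interest.
    if event == "subprocess.Popen":
        executable, popen_args, cwd, env = args
        seen.append((executable, popen_args, cwd))


# Note: an audit hook cannot be removed once it has been added.
sys.addaudithook(log_popen)

subprocess.run([sys.executable, "-c", "pass"], check=True)
for executable, popen_args, cwd in seen:
    print("Popen:", executable, popen_args, cwd)
```

As noted, the event fires before the child is created, so the hook sees the request but not the exit status or any output.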
[Python-ideas] Re: Recommend UTF-8 mode on Windows
On 1/14/20, Inada Naoki wrote: > > UTF-8 mode shouldn't take precedence over legacy FS encoding. > > Mercurial uses legacy encoding for file paths. They use > sys._enablelegacywindowsfsencoding() on Windows. > https://www.mercurial-scm.org/repo/hg/rev/8d5489b048b7 This runtime call can override the initial configuration that's based on environment variables and -X options. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/FXFBPIIZDFEZR5WVXVOMKMA5KLK3SNGH/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Recommend UTF-8 mode on Windows
On 1/10/20, Andrew Barnert via Python-ideas wrote: > On Jan 10, 2020, at 03:45, Inada Naoki wrote: > > Also, PYTHONUTF8 is only supported on Unix, so presumably it’s ignored if > you set it on Windows, right? The implementation of UTF-8 mode (i.e. -Xutf8) is cross-platform, though I think it could use some tweaking for Windows. >> I believe UTF-8 should be chosen by default for text encoding. > > Correct me if I’m wrong, but I think in Python 3.7 on Windows 10, the > filesystem encoding is already UTF-8, and the stdio console files are UTF-8 > (but under the covers actually wrap the native UTF-16 console APIs instead > of using msvcrt stdio), so the only issue is the locale encoding, right? Yes, 3.6+ in Windows defaults to UTF-8 for console I/O and the filesystem encoding. If for some reason you need the legacy behavior, it can be enabled via the following environment variables [1]: PYTHONLEGACYWINDOWSSTDIO and PYTHONLEGACYWINDOWSFSENCODING. Setting PYTHONLEGACYWINDOWSFSENCODING switches the filesystem encoding to "mbcs". Note that this does not use the system MBS (multibyte string) API. Python simply transcodes between UTF-16 and ANSI instead of UTF-8. Currently this setting takes precedence over UTF-8 mode, but I think it should be the other way around. Setting PYTHONLEGACYWINDOWSSTDIO uses the console input codepage for stdin and the console output codepage for stdout and stderr, but only if isatty is true and the process is attached to a console (see _Py_device_encoding in Python/fileutils.c). Otherwise it uses the system ANSI codepage. Note that this setting is currently **broken** in 3.8. In Python/initconfig.c, config_init_stdio_encoding calls config_get_locale_encoding to set config->stdio_encoding. This always uses the system ANSI codepage (e.g. 1252), even for console files for which this choice makes no sense. Combining UTF-8 mode with legacy Windows standard I/O is generally dysfunctional. 
The result is mojibake, unless the console codepage happens to be UTF-8. I'd prefer UTF-8 mode to take precedence over legacy standard I/O mode and have it imply non-legacy I/O. In both of the above cases, what I'd prefer is for UTF-8 mode to take precedence over legacy modes, i.e. to disable config->legacy_windows_fs_encoding and config->legacy_windows_stdio in the startup configuration. Regarding the MBS API and UTF-8 In Windows 10, it's possible to set the ANSI and OEM codepages to UTF-8 at both the system level (in the system control panel) and the application level (in the application manifest). But many functions are still only available in the WCS (wide-character string) API, such as GetLocaleInfoEx, GetFileInformationByHandleEx, and SetFileInformationByHandle. I don't know whether Microsoft plans to implement MBS wrappers in these cases. If the ANSI codepage is UTF-8, then the MBS file API (e.g. CreateFileA) is basically equivalent to Python's UTF-8 filesystem encoding. There's one exception. Python uses the "surrogatepass" error handler, which allows invalid surrogate codes (i.e. a "Wobbly" WTF-8 encoding). In contrast, the MBS API translates invalid surrogates to the replacement character (U+FFFD). I think Python's choice is more sensible because the WCS file API (e.g. CreateFileW) and filesystem drivers do not verify that strings are valid Unicode. The console uses the system OEM codepage as its default I/O codepage. Setting OEM to UTF-8 (at the system level, not at the application level), or manually setting the codepage to UTF-8 via `chcp.com 65001`, is a potential problem because the console doesn't support reading non-ASCII UTF-8 strings via ReadFile or ReadConsoleA. Prior to Windows 10, it returns an empty string for this case, which looks like EOF. The new console in Windows 10 instead translates each non-ASCII character as a null byte (e.g. "SPĀM" -> "SP\x00M"), which is better but still pretty much useless for reading non-English input. 
Python 3.6+ is for the most part immune to this. In the default configuration, it uses ReadConsoleW to read UTF-16 instead of relying on the input codepage. (Low-level os.read is not immune to the problem, however, because it is not integrated with the new console I/O implementation.) [1] https://docs.python.org/3/using/cmdline.html#environment-variables ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/G2NOSM6EFOOO5WCLTCEWJ7DWS57DDZTY/ Code of Conduct: http://python.org/psf/codeofconduct/
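To see which of the settings discussed above are in effect for a given interpreter, a small report along these lines may help. This is only a sketch; the function name `encoding_report` is hypothetical, and it assumes Python 3.7+ for `sys.flags.utf8_mode`.

```python
import locale
import os
import sys


def encoding_report():
    # Collect the encodings and the environment variables that influence them.
    return {
        "utf8_mode": bool(sys.flags.utf8_mode),
        "filesystem": sys.getfilesystemencoding(),
        "locale": locale.getpreferredencoding(False),
        # Standard streams may be replaced or missing, so read defensively.
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        "PYTHONUTF8": os.environ.get("PYTHONUTF8"),
        "PYTHONLEGACYWINDOWSFSENCODING":
            os.environ.get("PYTHONLEGACYWINDOWSFSENCODING"),
        "PYTHONLEGACYWINDOWSSTDIO":
            os.environ.get("PYTHONLEGACYWINDOWSSTDIO"),
    }


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print("%-32s %r" % (name, value))
```

Running it once normally and once with -Xutf8 (or with the legacy variables set) shows which setting wins in a particular Python version.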
[Python-ideas] Re: Suggestion: Windows launcher default to not using pre-releases by default
On 7/10/19, Brendan Barnwell wrote: > > I agree that it seems the real problem here is the lack of a real way > to determine if an available version is a real release or a > prerelease/beta. Is it not possible to change that, so that it is > possible for the launcher to quickly and easily determine the highest > release version available? In a previous reply, I gave a simple example based on FIELD3 of the file version (or product version) that's embedded in python[w].exe. This doesn't require changes to the registry, and doesn't require running the executable to parse version information from stdout, which would be relatively slow. It will only work for releases that have the version info in the executable. I don't recall when we started adding it, but I know the 2.7 executable doesn't have it. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/J7GPFOZYKEBI5BHYIWZIPVYX2UWBMLA2/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Suggestion: Windows launcher default to not using pre-releases by default
On 7/9/19, Steve Barnes wrote: > > Currently the py[w] command will launch the latest python by default however > I feel that this discourages the testing of pre-releases & release > candidates as once they are installed they will become the default. What I > would like is for the default to be the highest version number of a full > release but the user to be able to specify a specific version even if it is > a pre-release. With the existing launcher, if we install a pre-release candidate, we can set the PY_PYTHON environment variable to make the launcher default to a preferred stable release. To modify the launcher to detect a "final" build, we can check the file version from the PE image's FIXEDFILEINFO [1]. It consists of four 16-bit values: PY_MAJOR_VERSION, PY_MINOR_VERSION, FIELD3, PYTHON_API_VERSION. What we want is FIELD3, which is the upper WORD in the least significant DWORD (i.e. dwFileVersionLS >> 16). FIELD3 is computed as micro * 1000 + levelnum * 10 + serial where levelnum is alpha: 10 beta: 11 candidate: 12 final: 15 and serial is 0-9. The executable is a "final" release if FIELD3 modulo 1000 is at least 150. 
Here's a quick ctypes example with Python 3.7.3:

    import ctypes
    import sys

    version = ctypes.WinDLL('version', use_last_error=True)
    # Size in bytes of the version resource of the running executable.
    szBlock = version.GetFileVersionInfoSizeW(sys.executable, None)
    block = (ctypes.c_char * szBlock)()
    version.GetFileVersionInfoW(sys.executable, 0, szBlock, block)
    # The root block is the VS_FIXEDFILEINFO record, read as DWORDs.
    pinfo = ctypes.POINTER(ctypes.c_ulong)()
    szInfo = ctypes.c_ulong()
    version.VerQueryValueW(block, '\\', ctypes.byref(pinfo),
                           ctypes.byref(szInfo))

    >>> (pinfo[3] >> 16) % 1000
    150
    >>> sys.version_info
    sys.version_info(major=3, minor=7, micro=3, releaselevel='final', serial=0)

[1]: https://docs.microsoft.com/en-us/windows/win32/api/verrsrc/ns-verrsrc-tagvs_fixedfileinfo

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/IZBCCXXOZNDW6XEZUP3WSGSRRIXVJOVG/
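The FIELD3 arithmetic described above can be checked without touching the version resource at all. The following is a minimal sketch with hypothetical helper names (`encode_field3`, `decode_field3`, `is_final`) that simply applies the formula micro * 1000 + levelnum * 10 + serial.

```python
# Release-level numbers as listed in the message above.
LEVELS = {10: "alpha", 11: "beta", 12: "candidate", 15: "final"}


def encode_field3(micro, levelnum, serial):
    # FIELD3 = micro * 1000 + levelnum * 10 + serial
    return micro * 1000 + levelnum * 10 + serial


def decode_field3(field3):
    # Split FIELD3 back into (micro, release level name, serial).
    micro, rest = divmod(field3, 1000)
    levelnum, serial = divmod(rest, 10)
    return micro, LEVELS.get(levelnum, "unknown"), serial


def is_final(field3):
    # A final release has levelnum 15, so FIELD3 modulo 1000 is at least 150.
    return field3 % 1000 >= 150


if __name__ == "__main__":
    field3 = encode_field3(3, 15, 0)   # e.g. 3.7.3 final
    print(field3, decode_field3(field3), is_final(field3))
```

A launcher-side check would only need the `is_final` comparison; the decoding is shown here to make the layout of the field explicit.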
Re: [Python-ideas] shutil.symlink to allow non-race replacement of existing link targets
On 5/14/19, Steven D'Aprano wrote: > > On posix systems, you should be able to use chattr +i to make the file > immutable, so that the attacker cannot remove or replace it. Minor point of clarification. File attributes, and APIs to access them, are not in the POSIX standard. chattr is a Linux command that wraps the filesystem IOCTLs for getting and setting file attributes. There's no chattr system call, so thus far it's not supported in Python's os module. BSD and macOS have chflags, which supports both system- and user-immutable file attributes. Python supports it as os.chflags. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] shutil.symlink to allow non-race replacement of existing link targets
On 5/14/19, Serge Matveenko wrote:
>
> My point was that in case of `os.symlink` vs `shutil.symlink` it is
> not obvious how they are different even taking into account their
> namespaces.

I prefer to reserve POSIX system call names if possible, unless it's a generic name such as "open" or "close".

Note that there's also the possibility of extending pathlib's `symlink_to` method.
Re: [Python-ideas] Provide additional debug info for OSError and WindowsError
On 4/12/19, Giampaolo Rodola' wrote:
>
> As such I was thinking that perhaps it would be nice to provide 2 new
> cPython APIs:
>
> PyErr_SetFromErrnoWithMsg(PyObject *type, const char *msg)
> PyErr_SetFromWindowsErrWithMsg(int ierr, const char *msg)
> PyErr_SetExcFromWindowsErrWithMsg(PyObject *type, int ierr, const char
> *msg)
>
> With this in place also OSError and WindowsError would probably have
> to host a new "extramsg" attribute or something (but not necessarily).

Existing error handling would benefit from this proposal. win32_error [1], win32_error_object_error, and PyErr_SetFromWindowsErrWithFunction [2] take a function name that's currently ignored.

[1]: https://github.com/python/cpython/blob/v3.7.3/Modules/posixmodule.c#L1403
[2]: https://github.com/python/cpython/blob/v3.7.3/PC/winreg.c#L26
Re: [Python-ideas] Add subprocess.Popen suspend() and resume()
On 3/24/19, Giampaolo Rodola' wrote: > On Wed, Mar 20, 2019 at 11:19 PM eryk sun wrote: > >> This code repeatedly calls PsGetNextProcessThread to walk the >> non-terminated threads of the process in creation order (based on a >> linked list in the process object) and suspends each thread via >> PsSuspendThread. In contrast, a Tool-Help thread snapshot is >> unreliable since it won't include threads created after the snapshot >> is created. The alternative is to use a different undocumented system >> call, NtGetNextThread [2], which is implemented via >> PsGetNextProcessThread. But that's slightly worse than calling >> NtSuspendProcess. >> >> [1]: https://stackoverflow.com/a/11010508 >> [2]: https://github.com/processhacker/processhacker/blob/v2.39/ >> phnt/include/ntpsapi.h#L848 > > FWIW older psutil versions relied on Thread32Next / OpenThread / > SuspendThread / ResumeThread, which appear similar to these Ps* > counterparts (and I assume have the same drawbacks). This is the toolhelp snapshot I was talking about, which is an unreliable way to pause a process since it doesn't include threads created after the snapshot. For TH32CS_SNAPTHREAD, it's based on calling NtQuerySystemInformation: SystemProcessInformation to take a snapshot of all running processes and threads at the time. This buffer gets written to a shared section, and the section handle is returned as the snapshot handle. Thread32First and Thread32Next are called to walk the buffer a record at a time by temporarily mapping the section with NtMapViewOfSection and NtUnmapViewOfSection. In contrast, NtSuspendProcess is based on PsGetNextProcessThread, which walks a linked list of the non-terminated threads in the process. Unlike a snapshot, this won't miss threads created after we start, since new threads are appended to the list. To implement this in user mode with SuspendThread would require the NtGetNextThread system call that's implemented via PsGetNextProcessThread. 
But that's just trading one undocumented system call for another at the expense of a more complicated implementation.
Re: [Python-ideas] Add subprocess.Popen suspend() and resume()
On 3/18/19, Giampaolo Rodola' wrote:
>
> I've been having these 2 implemented in psutil for a long time. On
> POSIX these are convenience functions using os.kill() + SIGSTOP /
> SIGCONT (the same as CTRL+Z / "fg"). On Windows they use
> undocumented NtSuspendProcess and NtResumeProcess Windows
> APIs available since XP.

Currently, Windows Python only calls documented C runtime-library and Windows API functions. It doesn't directly call NT runtime-library and system functions. Maybe it could in the case of documented functions, but calling undocumented functions in the standard library should be avoided. Unfortunately, without NtSuspendProcess and NtResumeProcess, I don't see a way to reliably implement this feature for Windows. I'm CC'ing Steve Dower. He might say it's okay in this case, or know of another approach.

DebugActiveProcess, the other simple approach mentioned in the linked SO answer [1], is unreliable and has the wrong semantics. A process only has a single debug port, so DebugActiveProcess will fail the PID as an invalid parameter if another debugger is already attached to the process. (The underlying NT call, DbgUiDebugActiveProcess, fails with STATUS_PORT_ALREADY_SET.) Additionally, the semantics that I expect here, at least for Windows, is that each call to suspend() will require a corresponding call to resume(), since it's incrementing the suspend count on the threads; however, a debugger can't reattach to the same process. Also, if the Python process exits while it's attached as a debugger, the system will terminate the debuggee as well, unless we call DebugSetProcessKillOnExit(0), but that interferes with the Python process acting as a debugger normally, as does this entire wonky idea. Also, the debugging system creates a thread in the debuggee that calls NT DbgUiRemoteBreakin, which executes a breakpoint. This thread is waiting, but it's not suspended, so the process will never actually appear as suspended in Task Manager or Process Explorer.
That leaves enumerating threads in a snapshot and calling OpenThread and SuspendThread on each thread that's associated with the process. In comparison, let's take an abridged look at the guts of NtSuspendProcess.

    nt!NtSuspendProcess:
        ...
        mov     r8,qword ptr [nt!PsProcessType]
        ...
        call    nt!ObpReferenceObjectByHandleWithTag
        ...
        call    nt!PsSuspendProcess
        ...
        mov     ebx,eax
        call    nt!ObfDereferenceObjectWithTag
        mov     eax,ebx
        ...
        ret

    nt!PsSuspendProcess:
        ...
        call    nt!ExAcquireRundownProtection
        cmp     al,1
        jne     nt!PsSuspendProcess+0x74
        ...
        call    nt!PsGetNextProcessThread
        xor     ebx,ebx
        jmp     nt!PsSuspendProcess+0x62

    nt!PsSuspendProcess+0x4d:
        ...
        call    nt!PsSuspendThread
        ...
        call    nt!PsGetNextProcessThread

    nt!PsSuspendProcess+0x62:
        ...
        test    rax,rax
        jne     nt!PsSuspendProcess+0x4d
        ...
        call    nt!ExReleaseRundownProtection
        jmp     nt!PsSuspendProcess+0x79

    nt!PsSuspendProcess+0x74:
        mov     ebx,0C10Ah (STATUS_PROCESS_IS_TERMINATING)

    nt!PsSuspendProcess+0x79:
        ...
        mov     eax,ebx
        ...
        ret

This code repeatedly calls PsGetNextProcessThread to walk the non-terminated threads of the process in creation order (based on a linked list in the process object) and suspends each thread via PsSuspendThread. In contrast, a Tool-Help thread snapshot is unreliable since it won't include threads created after the snapshot is created. The alternative is to use a different undocumented system call, NtGetNextThread [2], which is implemented via PsGetNextProcessThread. But that's slightly worse than calling NtSuspendProcess.

[1]: https://stackoverflow.com/a/11010508
[2]: https://github.com/processhacker/processhacker/blob/v2.39/phnt/include/ntpsapi.h#L848
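For comparison, the POSIX side described in the quoted proposal (os.kill() with SIGSTOP / SIGCONT) needs very little code. This is a minimal POSIX-only sketch; the function names `suspend` and `resume` are illustrative and are not an existing subprocess API.

```python
import signal
import subprocess
import sys


def suspend(proc):
    # POSIX only: stop the whole process. SIGSTOP cannot be caught or ignored.
    proc.send_signal(signal.SIGSTOP)


def resume(proc):
    # POSIX only: continue a stopped process.
    proc.send_signal(signal.SIGCONT)


if __name__ == "__main__":
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        suspend(child)
        resume(child)
        print("still running:", child.poll() is None)
    finally:
        child.kill()
        child.wait()
```

Note that this does not count nested calls the way the Windows thread suspend count does: a single SIGCONT resumes the process no matter how many SIGSTOP signals were sent.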
Re: [Python-ideas] Running Python commands from a Shell
On 2/1/19, Steven D'Aprano wrote:
> On Fri, Feb 01, 2019 at 07:21:47PM -0600, eryk sun wrote:
>
>> As soon as "pipe" is mentioned, anyone familiar with the REPL's
>> behavior with pipes should know that making this work will require the
>> -i command-line option to force interactive mode. Otherwise stdout
>> will be fully buffered. For example:
> [...]
>
> I wonder... could Python automatically detect when it is connected to
> pipes and switch buffering off?

In most cases we want full buffering when standard I/O is a pipe or disk file. It's more efficient to read/write large chunks from/to the OS.

In another message I saw -u mentioned to disable buffering. But that's not sufficient. We need -i to force running the built-in REPL over a pipe, and optionally -q to quiet the initial banner message.
Re: [Python-ideas] Option of running shell/console commands inside the REPL
On 2/1/19, Terry Reedy wrote:
> On 2/1/2019 3:31 PM, Oleg Broytman wrote:
>
>> Python REPL is missing the following batteries:
>> * Persistent history;

Python's built-in REPL relies on the readline module for history. In Windows you'll need to install pyreadline, an implementation that uses the Windows console API via ctypes. Out of the box, Python uses the built-in line editing and history that's provided by the Windows console host (conhost.exe). There's an undocumented function to read this history (as used by doskey.exe), but there's no function to load lines into it. I suppose it could be replayed manually in a loop that calls WriteConsoleInputW and ReadConsoleW.

> * Windows Console holds a maximum of characters,

lines, not characters.
Re: [Python-ideas] Running Python commands from a Shell
On 2/1/19, Steven D'Aprano wrote:
> On Fri, Feb 01, 2019 at 04:28:25PM -0600, Dan Sommers wrote:
>
>> As I indicated in what you quoted, shell co-processes allow you to run a
>> command in the background and interact with that command from your
>> shell.
>
> Okay, but what does that mean in practice? What does it require to make
> it work with Python? What is your expected input and output?

bash coproc runs a process in the background with stdin and stdout redirected to pipes. The file descriptors for our end of the pipes are available in an array with the given name (e.g. P3). The default array name is COPROC.

As soon as "pipe" is mentioned, anyone familiar with the REPL's behavior with pipes should know that making this work will require the -i command-line option to force interactive mode. Otherwise stdout will be fully buffered. For example:

    $ coproc P3 { python3 -qi 2>&1; }
    [1] 16923
    $ echo 'import sys; print(sys.version)' >&${P3[1]}
    $ read -t 1 <&${P3[0]} && echo $REPLY
    >>> 3.6.7 (default, Oct 22 2018, 11:32:17)
    $ read -t 1 <&${P3[0]} && echo $REPLY
    [GCC 8.2.0]
    $ read -t 1 <&${P3[0]} && echo $REPLY
    $ echo 'sys.exit(42)' >&${P3[1]}
    $
    [1]+  Exit 42                 coproc P3 { python3 -qi 2>&1; }

> And are we supposed to know what ">&${P3[1]}" does? It looks like your
> cat walked over your keyboard.

It redirects the command's standard output (>) to the file descriptor (&) in index 1 of the P3 array (${P3[1]}), which is our end of the pipe that's connected to stdin of the co-process.
Re: [Python-ideas] struct.unpack should support open files
On 12/25/18, Steven D'Aprano wrote: > On Tue, Dec 25, 2018 at 04:51:18PM -0600, eryk sun wrote: >> >> Alternatively, we can memory-map the file via mmap. An important >> difference is that the mmap buffer interface is low-level (e.g. no >> file pointer and the offset has to be page aligned), so we have to >> slice out bytes for the given offset and size. We can avoid copying >> via memoryview slices. > > Seems awfully complicated. How do we do all these things, and what > advantage does it give? Refer to the mmap and memoryview docs. It is more complex, not significantly, but not something I'd suggest to a novice. Anyway, another disadvantage is that this requires a real OS file, not just a file-like interface. One possible advantage is that we can work naively and rely on the OS to move pages of the file to and from memory on demand. However, making this really convenient requires the ability to access memory directly with on-demand conversion, as is possible with ctypes (records & arrays) or numpy (arrays). Out of the box, multiprocessing works like this for shared-memory access. For example: import ctypes import multiprocessing class Record(ctypes.LittleEndianStructure): _pack_ = 1 _fields_ = (('a', ctypes.c_int), ('b', ctypes.c_char * 4)) a = multiprocessing.Array(Record, 2) a[0].a = 1 a[0].b = b'spam' a[1].a = 2 a[1].b = b'eggs' >>> a._obj Shared values and arrays are accessed out of a heap that uses arenas backed by mmap instances: >>> a._obj._wrapper._state ((, 0, 16), 16) >>> a._obj._wrapper._state[0][0].buffer The two records are stored in this shared memory: >>> a._obj._wrapper._state[0][0].buffer[:16] b'\x01\x00\x00\x00spam\x02\x00\x00\x00eggs' >> We can also use ctypes instead of >> memoryview/struct. > > Only if you want non-portable code. ctypes has good support for at least Linux and Windows, but it's an optional package in CPython's standard library and not necessarily available with other implementations. > What advantage over struct is ctypes? 
If it's available, I find that ctypes is often more convenient than the manual pack/unpack approach of struct. If we're writing to the file, ctypes lets us directly assign data to arrays and the fields of records on disk (the ctypes instance knows the address and its data descriptors handle converting values implicitly). The tradeoff is that defining structures in ctypes can be tedious (_pack_, _fields_) compared to the simple format strings of the struct module. With ctypes it helps to already be fluent in C.
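A small side-by-side sketch of the tradeoff described above. The bytearray here is an assumption standing in for a writable memory-mapped file; the Record layout is the one from the earlier example in this message.

```python
import ctypes
import struct

class Record(ctypes.LittleEndianStructure):
    _pack_ = 1
    _fields_ = (("a", ctypes.c_int), ("b", ctypes.c_char * 4))

# A writable buffer standing in for a memory-mapped file.
buf = bytearray(struct.pack("<i4s", 1, b"spam"))

# ctypes: bind the structure to the buffer and assign fields in place.
rec = Record.from_buffer(buf)
rec.a = 2
rec.b = b"eggs"

# struct: read the same bytes back with a format string.
a, b = struct.unpack_from("<i4s", buf)
```

The ctypes version needs the class definition up front, but after that each field assignment writes straight into the underlying buffer.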
Re: [Python-ideas] struct.unpack should support open files
On 12/24/18, Drew Warwick wrote:
> The struct unpack API is inconvenient to use with files. I must do:
>
> struct.unpack(fmt, file.read(struct.calcsize(fmt)))

Alternatively, we can memory-map the file via mmap. An important difference is that the mmap buffer interface is low-level (e.g. no file pointer and the offset has to be page aligned), so we have to slice out bytes for the given offset and size. We can avoid copying via memoryview slices. We can also use ctypes instead of memoryview/struct.
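A minimal sketch of the mmap approach, assuming a real OS file (here a temporary file written just for the demonstration) and a made-up two-record layout:

```python
import mmap
import struct
import tempfile

fmt = "<i4s"
size = struct.calcsize(fmt)

with tempfile.TemporaryFile() as f:
    # Write two records so there is something to map.
    f.write(struct.pack(fmt, 1, b"spam"))
    f.write(struct.pack(fmt, 2, b"eggs"))
    f.flush()

    # Map the whole file. A nonzero mmap offset would have to be a
    # multiple of mmap.ALLOCATIONGRANULARITY.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # Slice a memoryview rather than the mmap itself to avoid a copy.
        with memoryview(m) as view:
            second = struct.unpack(fmt, view[size:2 * size])
        # struct.unpack_from accepts the buffer and an offset directly.
        first = struct.unpack_from(fmt, m, 0)
```

For simple cases struct.unpack_from with an offset may be all that's needed; the memoryview slice matters more when a sub-range has to be handed to other code.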
Re: [Python-ideas] New PEP proposal -- Pathlib Module Should Contain All File Operations -- version 2
On Sat, Mar 17, 2018 at 10:42 AM, George Fischhof wrote:
>
> All functions from os module accept path-like objects,
> and none of the shutil functions.

shutil indirectly supports __fspath__ paths via os and os.path. One exception is shutil.disk_usage() on Windows, which only supports str strings. This is fixed in 3.7, in resolution of issue 32556. Maybe it should be backported to 3.6.

I like the idea of a high-level library that provides a subset of commonly used os, io, and shutil functionality in one place. But maybe a new module isn't required. shutil could be extended since its design goal is to provide "high-level operations on files and collections of files".

That said, pathlib's goal to support "concrete paths [that] provide I/O operations" does seem incomplete. It should support copy, copytree, rmtree, and move methods. Also, a `parents` option should be added to Path.rmdir to implement removedirs, which mirrors how Path.mkdir implements makedirs.

> os.link => path.hardlink_to

I'm surprised this doesn't already exist.
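As a quick illustration of shutil accepting path-like objects, here is a sketch that passes pathlib.Path instances directly. The file and directory names are made up for the example.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "src"
    src.mkdir()
    (src / "data.txt").write_text("spam")

    # Each call receives Path objects rather than str paths.
    shutil.copy(src / "data.txt", root / "copy.txt")
    shutil.copytree(src, root / "tree")
    shutil.move(root / "copy.txt", root / "moved.txt")
    shutil.rmtree(src)

    remaining = sorted(p.name for p in root.iterdir())
    copied_text = (root / "tree" / "data.txt").read_text()
```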
Re: [Python-ideas] Descouraging the implicit string concatenation
On Wed, Mar 14, 2018 at 12:18 PM, Facundo Batista wrote:
>
> Note that there's no penalty in adding the '+' between the strings,
> those are resolved at compilation time.

The above statement is not true for versions prior to 3.7. Previously the addition of string literals was optimized by the peephole optimizer, with a limit of 20 characters. Do you mean to formally discourage implicit string-literal concatenation only for 3.7+?
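One way to check this on a given interpreter is to look at the constants of the compiled code. This is a sketch; the helper name is made up, and the exact contents of co_consts are a CPython implementation detail that varies between versions.

```python
def constants(source):
    """Return the constants stored in the code object compiled from source."""
    return compile(source, "<example>", "eval").co_consts

# Implicit concatenation is joined by the parser itself.
implicit = constants('"spam" "eggs"')

# Explicit '+' relies on the compiler's constant folding.
explicit = constants('"spam" + "eggs"')
```

If 'spameggs' appears in both tuples, the '+' was resolved at compile time for that case.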
Re: [Python-ideas] Memory limits [was Re: Membership of infinite iterators]
On Thu, Oct 19, 2017 at 9:05 AM, Stephan Houbenwrote: > > I (quickly) tried to get something to work using the win32 package, > in particular the win32job functions. > However, it seems setting > "ProcessMemoryLimit" using win32job.SetInformationJobObject > had no effect > (i.e. a subsequent win32job.QueryInformationJobObject > still showed the limit as 0)? Probably you didn't set the JOB_OBJECT_LIMIT_PROCESS_MEMORY flag. Here's an example that tests the process memory limit using ctypes to call VirtualAlloc, before and after assigning the current process to the Job. Note that the py.exe launcher runs python.exe in an anonymous Job that's configured to kill on close (i.e. python.exe is killed when py.exe exits) and for silent breakaway of child processes. In this case, prior to Windows 8 (the first version to support nested Job objects), assigning the current process to a new Job will fail, so you'll have to run python.exe directly, or use a child process via subprocess. I prefer the former, since a child process won't be tethered to the launcher, which could get ugly for console applications. 
import ctypes import winerror, win32api, win32job kernel32 = ctypes.WinDLL('kernel32', use_last_error=True) MEM_COMMIT = 0x1000 MEM_RELEASE = 0x8000 PAGE_READWRITE = 4 kernel32.VirtualAlloc.restype = ctypes.c_void_p kernel32.VirtualAlloc.argtypes = (ctypes.c_void_p, ctypes.c_size_t, ctypes.c_ulong, ctypes.c_ulong) kernel32.VirtualFree.argtypes = (ctypes.c_void_p, ctypes.c_size_t, ctypes.c_ulong) hjob = win32job.CreateJobObject(None, "") limits = win32job.QueryInformationJobObject(hjob, win32job.JobObjectExtendedLimitInformation) limits['BasicLimitInformation']['LimitFlags'] |= ( win32job.JOB_OBJECT_LIMIT_PROCESS_MEMORY) limits['ProcessMemoryLimit'] = 2**31 win32job.SetInformationJobObject(hjob, win32job.JobObjectExtendedLimitInformation, limits) addr0 = kernel32.VirtualAlloc(None, 2**31, MEM_COMMIT, PAGE_READWRITE) if addr0: mem0_released = kernel32.VirtualFree(addr0, 0, MEM_RELEASE) win32job.AssignProcessToJobObject(hjob, win32api.GetCurrentProcess()) addr1 = kernel32.VirtualAlloc(None, 2**31, MEM_COMMIT, PAGE_READWRITE) Result: >>> addr0 2508252315648 >>> mem0_released 1 >>> addr1 is None True >>> ctypes.get_last_error() == winerror.ERROR_COMMITMENT_LIMIT True ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Adding an 'errors' argument to print
On Mon, Mar 27, 2017 at 8:52 PM, Barrywrote: > I took to using > > chcp 65001 > > This puts cmd.exe into unicode mode. conhost.exe hosts the console, and chcp.com is a console app that calls GetConsoleCP, SetConsoleCP and SetConsoleOutputCP to show or modify the console's input and output codepages. It doesn't support changing them separately. cmd.exe is just another console client, no different from python.exe or powershell.exe in this regard. Also, it's unrelated to how Python uses the console, but for the record, cmd has used the console's wide-character API since it was ported from OS/2 in the early 90s. Back then the console was hosted using threads in the csrss.exe system process, which made sense because the windowing system was hosted there. When they moved most of the window manager to kernel mode in NT 4 (1996), the console was mostly left behind in csrss.exe. It wasn't until Windows 7 that it found a new home in conhost.exe. In Windows 8 it got a real device driver instead of using fake file handles. In Windows 10 it was updated to be less of a franken-window -- e.g. now it has line-wrapped selection and text reflowing. Using codepage 65001 (UTF-8) in a console app has a couple of annoying bugs in the console itself, and another due to flushing of C FILE streams. For example, reading text that has even a single non-ASCII character will fail because conhost's encoding buffer is too small. It handles the error by returning a read of 0 bytes. That's EOF, so Python's REPL quits; input() raises EOFError; and stdin.read() returns an empty string. Microsoft should fix this in Windows 10, and probably will eventually. The Linux subsystem needs UTF-8, and it's silly that the console doesn't allow entering non-ASCII text in Linux programs. As was already recommended, I suggest using the wide-character API via win_unicode_console in 2.7 and 3.5. In 3.6 we get the wide-character API automatically thanks to Steve Dower's io._WindowsConsoleIO class. 
Re: [Python-ideas] Using Python for end user applications
On Tue, Feb 7, 2017 at 3:27 PM, Paul Moorewrote: > On 7 February 2017 at 14:29, Steve Dower wrote: >> You can leave python.exe out of your distribution to avoid it showing up on >> PATH, or if your stub explicitly LoadLibrary's vcruntime140.dll and then >> python36.dll you should be able to put them wherever you like. > > Understood, but I may need python.exe present if the script uses > multiprocessing, so I'm trying to avoid doing that (while I'm doing > things manually, I can do what I like, obviously, but a generic "build > my app" tool has to be a bit more cautious). > > LoadLibrary might work (I'm only calling Py_Main). I seem to recall > trying this before and having issues but that might have been an > earlier iteration which made more complex use of the C API. Also, I > want to load python3.dll (the stable ABI) as I don't want to have to > rebuild the stub once for each Python version, or have to search for > the correct DLL in C. But I'll definitely give that a go. LoadLibrary and GetProcAddress will work, but that would get tedious if a program needed a lot of Python's API. It's also a bit of a kludge having to manually call LoadLibrary with a given DLL order. For the latter, I wish we could simply load python3.dll using LoadLibraryEx with LOAD_WITH_ALTERED_SEARCH_PATH, but it doesn't work in good old Windows 7. python3.dll doesn't depend on python3x.dll in its DLL import table. I discovered in issue 29399 that in this case the loader in Windows 7 doesn't use the altered search path of python3.dll to load python3x.dll and vcruntime140.dll. As you're currently doing (as we discussed last September), creating an assembly in a subdirectory works in all supported Windows versions, and it's the most convenient way to access all of Python's limited API. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Is it Python 3 yet?
On Thu, Jan 26, 2017 at 10:49 PM, Paul Moorewrote: > On 26 January 2017 at 22:32, M.-A. Lemburg wrote: >> On 26.01.2017 23:09, Random832 wrote: >>> On Thu, Jan 26, 2017, at 11:21, Paul Moore wrote: On a similar note, I always get caught out by the fact that the Windows default download is the 32-bit version. Are we not yet at a point where a sufficient majority of users have 64-bit machines, and 32-bit should be seen as a "specialist" choice? >>> >>> I'm actually surprised it doesn't detect it, especially since it does >>> detect Windows. >>> >>> (I bet fewer people have supported 32-bit windows versions than have >>> Windows XP.) >> >> I think you have to differentiate a bit more between having a >> 64-bit OS and running 64-bit applications. >> >> Many applications on Windows are still 32-bit applications and >> unless you process large amounts of data, a 32-bit Python >> system is well worth using. In some cases, it's even needed, >> e.g. if you have to use an extension which links to a 32-bit >> library. > > I agree that there are use cases for a 32-bit Python. But for the > *average* user, I'd argue in favour of a 64-bit build as the default > download. Preferring the 64-bit version would be a friendlier experience for novices in general nowadays. I've had to explain WOW64 file-system redirection [1] and registry redirection [2] too many times to people who are using 32-bit Python on 64-bit Windows. I've seen people waste over a day on this silly problem. They can't imagine that Windows is basically lying to them. [1]: https://msdn.microsoft.com/en-us/library/aa384187 [2]: https://msdn.microsoft.com/en-us/library/aa384232 ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Suggestion: Clear screen command for the REPL
On Tue, Oct 4, 2016 at 2:22 PM, Random832wrote: > On Wed, Sep 28, 2016, at 23:36, Chris Angelico wrote: >> On Thu, Sep 29, 2016 at 12:04 PM, Steven D'Aprano >> wrote: >> > (Also, it seems a shame that Ctrl-D is EOF in Linux and Mac, but Windows >> > is Ctrl-Z + Return. Can that be standardized to Ctrl-D everywhere?) >> >> Sadly, I suspect not. If you're running in the default Windows >> terminal emulator (the one a normal user will get by invoking >> cmd.exe), you're running under a lot of restrictions, and I believe >> one of them is that you can't get Ctrl-D without an enter. > > Well, we could read _everything_ in character-at-a-time mode, and > implement our own line editing. In effect, that's what readline is > doing. 3.6+ switched to calling ReadConsoleW, which allows using a 32-bit control mask to indicate which ASCII control codes should terminate a read. The control character is left in the input string, so it's possible to define custom behavior for multiple control characters. Here's a basic ctypes example of how this feature works. In each case, after calling ReadConsoleW I enter "spam" and then type a control character to terminate the read. import sys import msvcrt import ctypes kernel32 = ctypes.WinDLL('kernel32', use_last_error=True) ReadConsoleW = kernel32.ReadConsoleW CTRL_MASK = 2 ** 32 - 1 # all ctrl codes hin = msvcrt.get_osfhandle(sys.stdin.fileno()) buf = (ctypes.c_wchar * 10)(*('-' * 10)) pn = (ctypes.c_ulong * 1)() ctl = (ctypes.c_ulong * 4)(16, 0, CTRL_MASK, 0) >>> # Ctrl+2 or Ctrl+@ (i.e. NUL) ... ret = ReadConsoleW(hin, buf, 10, pn, ctl); print() spam >>> buf[:] 'spam\x00-' >>> # Ctrl+D ... ret = ReadConsoleW(hin, buf, 10, pn, ctl); print() spam >>> buf[:] 'spam\x04-' >>> # Ctrl+[ ... ret = ReadConsoleW(hin, buf, 10, pn, ctl); print() spam >>> buf[:] 'spam\x1b-' This could be used to implement Ctrl+D and Ctrl+L support in PyOS_Readline. Supporting Ctrl+L to work like GNU readline wouldn't be a trivial one-liner, but it's doable. 
It has to clear the screen and also write the input (except the Ctrl+L) back to the input buffer. > The main consequence of reading everything in character-at-a-time mode > is that we'd have to implement everything ourselves, and the line > editing you get *without* doing it yourself is somewhat nicer on Windows > than on Linux (it supports cursor movement, inserting characters, and > history). Line-input mode also supports F7 for a history popup window to select a previous command; Ctrl+F to search the screen text; text selection (e.g. shift+arrows or Ctrl+A); copy/paste via Ctrl+C and Ctrl+V (or Ctrl+Insert and Shift+Insert); and parameterized input aliases ($1-$9 and $* for parameters). https://technet.microsoft.com/en-us/library/mt427362 https://technet.microsoft.com/en-us/library/cc753867 >> "Bash on Ubuntu on windows" responds to CTRL+D just fine. I don't really >> know how it works, but it looks like it is based on the Windows terminal >> emulator. > > It runs inside it, but it's using the "Windows Subsystem for Linux", > which (I assume) reads character-at-a-time and feeds it to a Unix-like > terminal driver, (which Bash then has incidentally also put in > character-at-a-time mode by using readline - to see what you get on WSL > *without* doing this, try running "cat" under bash.exe) Let's take a look at how WSL modifies the console's global state. 
Here's a simple function to print the console's input and output modes and codepages, which we can call in the background to monitor the console state: def report(): hin = msvcrt.get_osfhandle(0) hout = msvcrt.get_osfhandle(1) modeIn = (ctypes.c_ulong * 1)() modeOut = (ctypes.c_ulong * 1)() kernel32.GetConsoleMode(hin, modeIn) kernel32.GetConsoleMode(hout, modeOut) cpIn = kernel32.GetConsoleCP() cpOut = kernel32.GetConsoleOutputCP() print('\nmodeIn=%x, modeOut=%x, cpIn=%d, cpOut=%d' % (modeIn[0], modeOut[0], cpIn, cpOut)) def monitor(): report() t = threading.Timer(10, monitor, ()) t.start() >>> monitor(); subprocess.call('bash.exe') modeIn=f7, modeOut=3, cpIn=437, cpOut=437 ... modeIn=2d8, modeOut=f, cpIn=65001, cpOut=65001 See the following page for a description of the mode flags: https://msdn.microsoft.com/en-us/library/ms686033 The output mode changed from 0x3 to 0xf, enabling DISABLE_NEWLINE_AUTO_RETURN (0x8) ENABLE_VIRTUAL_TERMINAL_PROCESSING (0x4) The input mode changed from 0xf7 to 0x2d8, enabling ENABLE_VIRTUAL_TERMINAL_INPUT (0x200) ENABLE_WINDOW_INPUT (0x8, probably for SIGWINCH) and disabling ENABLE_INSERT_MODE (0x20) ENABLE_ECHO_INPUT (0x4) ENABLE_LINE_INPUT (0x2) ENABLE_PROCESSED_INPUT (0x1) So you're correct that it's basically using a
Re: [Python-ideas] Suggestion: Clear screen command for the REPL
On Thu, Sep 29, 2016 at 7:08 AM, Stephan Houben wrote:
>
> I just tried with this official Python binary:
> Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC v.1900 32 bit
> (Intel)] on win32
>
> and CTRL-L for sure does clear the window. It just doesn't then move the
> prompt to the top, so you end up with a bunch of empty lines, followed by
> the prompt.

You probably have pyreadline installed. It calls ReadConsoleInputW to read low-level input records, bypassing the console's normal cooked read. See the following file that defines the key binding:

https://github.com/pyreadline/pyreadline/blob/1.7/pyreadline/configuration/pyreadlineconfig.ini#L18

Unfortunately pyreadline is broken for non-ASCII input. It ignores the Alt+Numpad record sequences used for non-ASCII characters.

Without having to implement the readline module for Windows (personally, I don't use it), support for Ctrl+L can be added relatively easily in 3.6+. ReadConsoleW takes a parameter to specify a mask of ASCII control characters that terminate a read. The control character is left in the buffer, so code just has to be written that looks for various control characters to implement features such as a Ctrl+L clear screen.
Re: [Python-ideas] Suggestion: Clear screen command for the REPL
On Mon, Sep 19, 2016 at 1:12 PM, Paul Moore wrote:
> By the way - if you're on a system with readline support included with
> Python, GNU readline apparently has a binding for clear-screen
> (CTRL-L) so you may well have this functionality already (I don't use
> Unix or readline, so I can't comment for sure).

Hooking Ctrl+L to clear the screen can be implemented for Windows Vista and later via the ReadConsole pInputControl parameter, as called by PyOS_StdioReadline. It should be possible to match how GNU readline works -- i.e. clear the screen, reprint the prompt, flush the input buffer, and write the current line's input back to the input buffer.

The pInputControl parameter can also be used to implement Unix-style Ctrl+D to end a read anywhere on a line, whereas the classic [Ctrl+Z][Enter] has to be entered at the start of a line.
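To make the pInputControl idea concrete, here is a sketch of how the control structure and wakeup mask could be built with ctypes. The field names follow the documented CONSOLE_READCONSOLE_CONTROL structure; wakeup_mask is a hypothetical helper, and nothing here calls the console API, so it runs on any platform.

```python
import ctypes

class CONSOLE_READCONSOLE_CONTROL(ctypes.Structure):
    # Mirrors the documented Windows structure; ULONG is 32 bits there.
    _fields_ = (
        ("nLength", ctypes.c_uint32),
        ("nInitialChars", ctypes.c_uint32),
        ("dwCtrlWakeupMask", ctypes.c_uint32),
        ("dwControlKeyState", ctypes.c_uint32),
    )

def wakeup_mask(*keys):
    """Build a mask with one bit per control code, e.g. 'D' (0x04) -> bit 4."""
    mask = 0
    for key in keys:
        mask |= 1 << (ord(key.upper()) - ord("@"))
    return mask

control = CONSOLE_READCONSOLE_CONTROL()
control.nLength = ctypes.sizeof(control)
# Terminate the read on Ctrl+D (end of input) or Ctrl+L (clear screen).
control.dwCtrlWakeupMask = wakeup_mask("D", "L")
```

On Windows the structure would be passed by reference as the last argument of ReadConsoleW; the caller then checks which control character ended the read.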
Re: [Python-ideas] Suggestion: Clear screen command for the REPL
On Sat, Sep 17, 2016 at 1:15 PM, Wes Turner wrote:
> !cls #windows

cmd's built-in cls command doesn't clear just the screen, like a VT100 \x1b[1J. It clears the console's entire scrollback buffer. Unix `clear` may also work like that. With GNOME Terminal in Linux, `clear` leaves a single screen in the scrollback buffer.
Re: [Python-ideas] Adding optional parameter to shutil.rmtree to not delete root.
On Thu, Aug 25, 2016 at 2:29 AM, Nick Jacobson via Python-ideaswrote: > I've been finding that a common scenario is where I want to remove > everything in a directory, but leave the (empty) root directory behind, not > removing it. > > So for example, if I have a directory C:\foo and it contains subdirectory > C:\foo\bar and file C:\foo\myfile.txt, and I want to remove the subdirectory > (and everything in it) and file, leaving only C:\foo behind. > > (This is useful e.g. when the root directory has special permissions, so it > wouldn't be so simple to remove it and recreate it again.) Here's a Windows workaround that clears the delete disposition after rmtree 'deletes' the directory. A Windows file or directory absolutely cannot be unlinked while there are handle or kernel references to it, and a handle with DELETE access can set and unset the delete disposition. This used to require the system call NtSetInformationFile, but Vista added SetFileInformationByHandle to the Windows API. import contextlib import ctypes import _winapi kernel32 = ctypes.WinDLL('kernel32', use_last_error=True) kernel32.SetFileInformationByHandle # Vista minimum (NT 6.0+) DELETE = 0x0001 SHARE_ALL = 7 OPEN_EXISTING = 3 BACKUP = 0x0200 FileDispositionInfo = 4 @contextlib.contextmanager def protect_file(path): hFile = _winapi.CreateFile(path, DELETE, SHARE_ALL, 0, OPEN_EXISTING, BACKUP, 0) try: yield if not kernel32.SetFileInformationByHandle( hFile, FileDispositionInfo, (ctypes.c_ulong * 1)(0), 4): raise ctypes.WinError(ctypes.get_last_error()) finally: kernel32.CloseHandle(hFile) For example: >>> os.listdir('test') ['dir1', 'dir2', 'file'] >>> with protect_file('test'): ... shutil.rmtree('test') ... >>> os.listdir('test') [] Another example: >>> open('file', 'w').close() >>> with protect_file('file'): ... os.remove('file') ... 
>>> os.path.exists('file') True
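For cases where the delete-disposition trick isn't needed, a portable alternative is to delete the children and never touch the root. This is a sketch with a hypothetical helper name; it makes no attempt to handle Windows directory symlinks or junctions specially.

```python
import os
import shutil
import tempfile

def clear_directory(path):
    """Remove everything inside path but keep path itself."""
    with os.scandir(path) as it:
        entries = list(it)  # snapshot before deleting anything
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)
        else:
            os.remove(entry.path)

# Demonstration on a throwaway directory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "dir1"))
open(os.path.join(root, "dir1", "nested"), "w").close()
open(os.path.join(root, "file"), "w").close()
clear_directory(root)
left_over = os.listdir(root)
root_survived = os.path.isdir(root)
os.rmdir(root)
```

Unlike protect_file, this never marks the root for deletion, so the root's permissions and any open handles to it are left alone.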
Re: [Python-ideas] discontinue iterable strings
On Sun, Aug 21, 2016 at 6:34 AM, Michael Selik wrote:
> The detection of not hashable via __hash__ set to None was necessary, but
> not desirable. Better to have never defined the method/attribute in the
> first place. Since __iter__ isn't present on ``object``, we're free to use
> the better technique of not defining __iter__ rather than defining it as
> None, NotImplemented, etc. This is superior, because we don't want __iter__
> to show up in a dir(), help(), or other tools.

The point is to be able to define __getitem__ without falling back on the sequence iterator. I wasn't aware of the recent commit that allows anti-registration of __iter__. This is perfect:

    >>> class C:
    ...     __iter__ = None
    ...     def __getitem__(self, index): return 42
    ...
    >>> iter(C())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'C' object is not iterable
    >>> isinstance(C(), collections.abc.Iterable)
    False
Re: [Python-ideas] discontinue iterable strings
On Sun, Aug 21, 2016 at 5:27 AM, Chris Angelico wrote:
> Hmm. It would somehow need to be recognized as "not iterable". I'm not
> sure how this detection is done; is it based on the presence/absence
> of __iter__, or is it by calling that method and seeing what comes
> back? If the latter, then sure, an __iter__ that raises would cover
> that.

PyObject_GetIter calls __iter__ (i.e. tp_iter) if it's defined. To get a TypeError, __iter__ can return an object that's not an iterator, i.e. an object that doesn't have a __next__ method (i.e. tp_iternext). For example:

    >>> class C:
    ...     def __iter__(self): return self
    ...     def __getitem__(self, index): return 42
    ...
    >>> iter(C())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: iter() returned non-iterator of type 'C'

If __iter__ isn't defined but __getitem__ is defined, then PySeqIter_New is called to get a sequence iterator.

    >>> class D:
    ...     def __getitem__(self, index): return 42
    ...
    >>> it = iter(D())
    >>> type(it)
    <class 'iterator'>
    >>> next(it)
    42
Re: [Python-ideas] Fix default encodings on Windows
On Thu, Aug 18, 2016 at 3:25 PM, Steve Dower wrote:
> allow us to change locale.getpreferredencoding() to utf-8 on Windows

_bootlocale.getpreferredencoding would need to be hard coded to return 'utf-8' on Windows. _locale._getdefaultlocale() itself shouldn't return 'utf-8' as the encoding because the CRT doesn't allow it as a locale encoding.

site.aliasmbcs() uses getpreferredencoding, so it will need to be modified. The codecs module could add get_acp and get_oemcp functions based on GetACP and GetOEMCP, returning for example 'cp1252' and 'cp850'. Then aliasmbcs could call get_acp.

Adding get_oemcp would also help with decoding output from subprocess.Popen. There's been discussion about adding encoding and errors options to Popen, and what the default should be. When writing to a pipe or file, some programs use OEM, some use ANSI, some use the console codepage if available, and far fewer use Unicode encodings. Obviously it's better to specify the encoding in each case if you know it.

Regarding the locale module, how about modernizing _locale._getdefaultlocale to return the Windows locale name [1] from GetUserDefaultLocaleName? For example, it could return a tuple such as ('en-GB', None) and ('uz-Latn-UZ', None) -- always with the encoding set to None. The CRT accepts the new locale names, but it isn't quite up to speed. It still sets a legacy locale when the locale string is empty. In this case the high-level setlocale could call _getdefaultlocale. Also _parse_localename, which is called by getlocale, needs to return a tuple with the encoding as None. Currently it raises a ValueError for Windows locale names as defined by [1].

[1]: https://msdn.microsoft.com/en-us/library/dd373814
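The proposed helpers don't exist in the codecs module; the following is only a sketch of what get_acp and get_oemcp might look like if written with ctypes. The names and the None return value on other platforms are assumptions made for the illustration.

```python
import ctypes
import sys

def _codepage_name(api_name):
    """Call a kernel32 function that returns a codepage number (hypothetical helper)."""
    if sys.platform != "win32":
        return None  # no ANSI/OEM codepage concept outside Windows
    kernel32 = ctypes.WinDLL("kernel32")
    return "cp%d" % getattr(kernel32, api_name)()

def get_acp():
    """Return the ANSI codepage as a codec name, e.g. 'cp1252'."""
    return _codepage_name("GetACP")

def get_oemcp():
    """Return the OEM codepage as a codec name, e.g. 'cp850'."""
    return _codepage_name("GetOEMCP")
```

With something like this, output from a child process known to use the OEM codepage could be decoded with the name returned by get_oemcp().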
Re: [Python-ideas] Fix default encodings on Windows
On Thu, Aug 18, 2016 at 2:32 AM, Stephen J. Turnbull wrote:
>
> So it's not just invalid surrogate *pairs*, it's invalid surrogates of
> all kinds. This means that it's theoretically possible (though I
> gather that it's unlikely in the extreme) for a real Windows filename
> to be indistinguishable from one generated by Python's surrogateescape
> handler.

Absolutely if the filesystem is one of Microsoft's such as NTFS, FAT32, exFAT, ReFS, NPFS (named pipes), MSFS (mailslots) -- and I'm pretty sure it's also possible with CDFS and UDFS. UDF allows any Unicode character except NUL.

> What happens when Python's directory manipulation functions on Windows
> encounter such a filename? Do they try to write it to the disk
> directory? Do they succeed? Does that depend on surrogateescape?

Python allows these 'Unicode' (but not strictly UTF compatible) strings, so it doesn't have a problem with such filenames, as long as it's calling the Windows wide-character APIs.

> Is there a reason in practice to allow surrogateescape at all on names
> in Windows filesystems, at least when using the *W API? You mention
> non-Microsoft filesystems; are they common enough to matter?

Previously I gave an example with a VirtualBox shared folder, which rejects names with invalid surrogates. I don't know how common that is in general. I typically switch between 2 guests on a Linux host and share folders between systems. In Windows I mount shared folders as directory symlinks in C:\Mount.

I just tested another example that led to different results. Ext2Fsd is a free ext2/ext3 filesystem driver for Windows. I mounted an ext2 disk in Windows 10. Next, in Python I created a file named "\udc00b\udc00a\udc00d" in the root directory. Ext2Fsd defaults to using UTF-8 as the drive codepage, so I expected it to reject this filename, just like VBoxSF does.
But it worked: >>> os.listdir('.')[-1] '\udc00b\udc00a\udc00d' As expected the ANSI API substitutes question marks for the surrogate codes: >>> os.listdir(b'.')[-1] b'?b?a?d' So what did Ext2Fsd write in this supposedly UTF-8 filesystem? I mounted the disk in Linux to check: >>> os.listdir(b'.')[-1] b'\xed\xb0\x80b\xed\xb0\x80a\xed\xb0\x80d' It blindly encoded the surrogate codes, creating invalid UTF-8. I think it's called WTF-8 (Wobbly Transformation Format). The file manager in Linux displays this file as "���b���a���d (invalid encoding)", and ls prints "???b???a???d". Python uses its surrogateescape error handler: >>> os.listdir('.')[-1] '\udced\udcb0\udc80b\udced\udcb0\udc80a\udced\udcb0\udc80d' The original name can be decoded using the surrogatepass error handler: >>> os.listdir(b'.')[-1].decode(errors='surrogatepass') '\udc00b\udc00a\udc00d' ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
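The byte-level behavior shown above can be reproduced without any filesystem, using only the codec error handlers. A small sketch with the same file name:

```python
name = "\udc00b\udc00a\udc00d"

# surrogatepass encodes lone surrogates as if they were ordinary code
# points, which yields the invalid UTF-8 seen on the ext2 disk.
raw = name.encode("utf-8", errors="surrogatepass")

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate
# U+DCNN, which is what os.listdir('.') returned in Linux.
escaped = raw.decode("utf-8", errors="surrogateescape")

# Decoding with surrogatepass restores the original name.
restored = raw.decode("utf-8", errors="surrogatepass")
```

The strict handler, by contrast, refuses both directions: name.encode("utf-8") and raw.decode("utf-8") each raise a Unicode error.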
Re: [Python-ideas] Fix default encodings on Windows
On Mon, Aug 15, 2016 at 6:26 PM, Steve Dower wrote:
>
> and using the *W APIs exclusively is the right way to go.

My proposal was to use the wide-character APIs, but transcoding CP_ACP without best-fit characters and raising a warning whenever the default character is used (e.g. substituting Katakana middle dot when creating a file using a bytes path that has an invalid sequence in CP932). This proposal was in response to the case made by Stephen Turnbull. If using UTF-8 is getting such heavy pushback, I thought half a solution was better than nothing, and it also sets up the infrastructure to easily switch to UTF-8 if that idea eventually gains acceptance. It could raise exceptions instead of warnings if that's preferred, since bytes paths on Windows are already deprecated.

> *Any* encoding that may silently lose data is a problem, which basically
> leaves utf-16 as the only option. However, as that causes other problems,
> maybe we can accept the tradeoff of returning utf-8 and failing when a
> path contains invalid surrogate pairs

Are there any common sources of illegal UTF-16 surrogates in Windows filenames? I see that WTF-8 (Wobbly) was developed to handle this problem. A WTF-8 path would roundtrip back to the filesystem, but it should only be used internally in a program.
Re: [Python-ideas] Fix default encodings on Windows
On Thu, Aug 11, 2016 at 9:07 AM, Paul Moore wrote:
> set codepage to UTF-8
> ...
> set codepage back
> spawn subprocess X, but don't wait for it
> set codepage to UTF-8
> ...
> ... At this point what codepage does Python see? What codepage does
> process X see? (Note that they are both sharing the same console).

The input and output codepages are global data in conhost.exe. They aren't tracked for each attached process (unlike input history and aliases). That's how chcp.com works in the first place. Otherwise its calls to SetConsoleCP and SetConsoleOutputCP would be pointless.

But IMHO all talk of using codepage 65001 is a waste of time. I think the trailing garbage output with this codepage in Windows 7 is unacceptable. And getting EOF for non-ASCII input is a show stopper. The problem occurs in conhost. All you get is the EOF result from ReadFile/ReadConsoleA, so it can't be worked around. This kills the REPL and raises EOFError for input().

ISTM the only people who think codepage 65001 actually works are those using Windows 8+ who occasionally need to print non-OEM text and never enter (or paste) anything but ASCII text.
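For anyone who wants to observe this, here is a minimal ctypes sketch that queries the console's current input and output codepages via GetConsoleCP and GetConsoleOutputCP. The helper name console_codepages is hypothetical, and the function simply returns None on non-Windows platforms.

```python
import sys

def console_codepages():
    """Return (input_cp, output_cp) for the attached console, or None off Windows."""
    if sys.platform != 'win32':
        return None
    import ctypes
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    # Both calls ask the console itself, so every process attached to the
    # same console should see the same pair of values.
    return kernel32.GetConsoleCP(), kernel32.GetConsoleOutputCP()

print(console_codepages())
```

Running this in two processes attached to the same console, before and after chcp.com changes the codepage, should show both processes reporting the new value.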
Re: [Python-ideas] Fix default encodings on Windows
On Wed, Aug 10, 2016 at 11:30 PM, Random832 wrote:
> Er... utf-8 doesn't work reliably with arbitrary bytes paths either,
> unless you intend to use surrogateescape (which you could also do with
> mbcs).
>
> Is there any particular reason to expect all bytes paths in this
> scenario to be valid UTF-8?

The problem is more so that data is lost without an error when using the legacy ANSI API. If the path is invalid UTF-8, Python will at least raise an exception when decoding it. To work around this, the developers may decide they need to just bite the bullet and use Unicode, or maybe there could be legacy Latin-1 and ANSI modes enabled by an environment variable or sys flag.
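The difference can be sketched with plain codec calls. This is only an analogy: the 'replace' handler below stands in for the silent substitution done by the ANSI API and is not that API's actual behaviour. The variable names are illustrative.

```python
raw = b'\x81\xad'  # not a valid UTF-8 sequence

# Strict UTF-8 decoding fails loudly instead of losing data.
try:
    raw.decode('utf-8')
    raised = False
except UnicodeDecodeError:
    raised = True

# 'surrogateescape' smuggles the undecodable bytes through as lone
# surrogates, so the original bytes can be recovered.
escaped = raw.decode('utf-8', 'surrogateescape')
restored = escaped.encode('utf-8', 'surrogateescape')

# A lossy substitution (here U+FFFD) cannot be reversed.
lossy = raw.decode('utf-8', 'replace')

print(raised, restored == raw, lossy.encode('utf-8') == raw)
```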
Re: [Python-ideas] Fix default encodings on Windows
On Wed, Aug 10, 2016 at 8:09 PM, Random832 wrote:
> On Wed, Aug 10, 2016, at 15:22, Steve Dower wrote:
>>
>> Allowing library developers who support POSIX and Windows to just use
>> bytes everywhere to represent paths.
>
> Okay, how is that use case impacted by it being mbcs instead of utf-8?

Using 'mbcs' doesn't work reliably with arbitrary bytes paths in locales that use a DBCS codepage such as 932. If a sequence is invalid, it gets passed to the filesystem as the default Unicode character, so it won't successfully roundtrip. In the following example b"\x81\xad", which isn't defined in CP932, gets mapped to the codepage's default Unicode character, Katakana middle dot, which encodes back as b"\x81E":

    >>> locale.getpreferredencoding()
    'cp932'
    >>> open(b'\x81\xad', 'w').close()
    >>> os.listdir('.')
    ['・']
    >>> unicodedata.name(os.listdir('.')[0])
    'KATAKANA MIDDLE DOT'
    >>> '・'.encode('932')
    b'\x81E'

This isn't a problem for single-byte codepages, since every byte value uniquely maps to a Unicode code point, even if it's simply b'\x81' => u"\x81". Obviously there's still the general problem of dealing with arbitrary Unicode filenames created by other programs, since the ANSI API can only return a best-fit encoding of the filename, which is useless for actually accessing the file.

>> It probably also entails opening the file descriptor in bytes mode,
>> which might break programs that pass the fd directly to CRT functions.
>> Personally I wish they wouldn't, but it's too late to stop them now.
>
> The only thing O_TEXT does rather than O_BINARY is convert CRLF line
> endings (and maybe end on ^Z), and I don't think we even expose the
> constants for the CRT's unicode modes.

Python 3 uses O_BINARY when opening files, unless you explicitly call os.open. Specifically, FileIO.__init__ adds O_BINARY to the open flags if the platform defines it. The Windows CRT reads the BOM for the Unicode modes O_WTEXT, O_U16TEXT, and O_U8TEXT. For O_APPEND | O_WRONLY mode, this requires opening the file twice, the first time with read access. See configure_text_mode() in "Windows Kits\10\Source\10.0.10586.0\ucrt\lowio\open.cpp". Python doesn't expose or use these Unicode text-mode constants. That's for the best because in Unicode mode the CRT invokes the invalid parameter handler when a buffer doesn't have an even number of bytes, i.e. a multiple of sizeof(wchar_t).

Python could copy how configure_text_mode() handles the BOM, except it shouldn't write a BOM for new UTF-8 files.
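A minimal sketch of what such BOM handling might look like in Python follows. It is not a transcription of the CRT's logic: the helper names detect_bom and bom_for_new_file are hypothetical, and only the UTF-8 and UTF-16 LE BOMs are considered.

```python
import codecs

def detect_bom(head: bytes):
    """Return (encoding, bom_length) for a recognised BOM, else (None, 0)."""
    if head.startswith(codecs.BOM_UTF8):
        return 'utf-8', len(codecs.BOM_UTF8)
    if head.startswith(codecs.BOM_UTF16_LE):
        return 'utf-16-le', len(codecs.BOM_UTF16_LE)
    return None, 0

def bom_for_new_file(encoding: str) -> bytes:
    """BOM to write at the start of a newly created file (assumed policy)."""
    # Per the suggestion above: no BOM for new UTF-8 files.
    if encoding == 'utf-16-le':
        return codecs.BOM_UTF16_LE
    return b''

print(detect_bom(codecs.BOM_UTF8 + b'abc'))
print(detect_bom(b'plain ascii'))
print(bom_for_new_file('utf-8'))
```

When appending to an existing file, the first few bytes would have to be read before writing, which is why the CRT needs read access on its first open.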