[Python-Dev] Re: Small lament...
On 4/1/23, Skip Montanaro wrote: > Just wanted to throw this out there... I lament the loss of waking up on > April 1st to see a creative April Fool's Day joke on one or both of these > lists, often from our FLUFL... Maybe such frivolity still happens, just not > in the Python ecosystem? I thought this one was funny: https://github.com/python/cpython/issues/103172 ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/G2Z7QUZA4E3J6BE7HLIWM6R3FWHFYWTV/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Debugging of native extensions on windows
On 3/13/23, Rokas Kupstys wrote:
> I eventually stumbled on to process list showing
> ".venv/Scripts/python.exe" having spawned a subprocess... Which led me
> to "PC/launcher.c" which is what ".venv/Scripts/python.exe" really is.

For a standard Python installation, you can create a virtual environment with the --symlinks option instead of the default configuration that uses the venv launcher. Note, however, that using symlinks doesn't work with the store app distribution of Python.

If your system doesn't have developer mode enabled, creating symlinks requires "SeCreateSymbolicLinkPrivilege". By default this privilege is only granted to administrators. However, an administrator can use the management console "secpol.msc" snap-in to grant the symlink privilege directly to a user account, or to one of the account's default enabled groups such as "Authenticated Users". Add the user or group to the "Create symbolic links" policy in "Security Settings" -> "Local Policies" -> "User Rights Assignment". You'll have to log off and back on again to get a new access token that has the symlink privilege.

Unfortunately, the shell API -- e.g. os.startfile() -- resolves the final path of an executable before running it. This allows using filesystem symlinks as if they're shortcuts (LNK files), but it prevents using a symlink to change the name or path of an executable to get different expected behavior, such as a Python virtual environment that uses symlinks.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3PJJDU6WVNV7K65RZEDMBERCCAVIS5P6/
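As a rough illustration of the --symlinks option discussed in this message, the standard venv module offers the same choice through `EnvBuilder(symlinks=True)`. The helper name `create_symlink_venv` is made up for this sketch, and pip is skipped only to keep the example quick; whether symlinks are actually created still depends on the platform and privileges described in the message.

```python
import os
import venv


def create_symlink_venv(env_dir):
    """Create a virtual environment that prefers symlinks over copies.

    Roughly what `python -m venv --symlinks env_dir` does.
    `create_symlink_venv` is a hypothetical helper name for this sketch.
    """
    builder = venv.EnvBuilder(symlinks=True, with_pip=False)
    builder.create(env_dir)
    # pyvenv.cfg marks the directory as a virtual environment.
    return os.path.join(env_dir, "pyvenv.cfg")
```

Calling `create_symlink_venv("some-env")` returns the path of the new environment's pyvenv.cfg file.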
[Python-Dev] Re: glob's new include_hidden parameter
On 9/12/22, Mats Wichmann wrote:
>
> If `include_hidden` is true, the patterns '*', '?', '**' will
> match hidden directories.

Shouldn't this explain what a "hidden directory" is? For example, a Windows user may think this means a directory with FILE_ATTRIBUTE_HIDDEN set, but that's not what's meant here.

Also, I think it should note that enabling include_hidden negates the earlier claim that "files beginning with a dot (.) can only be matched by patterns that also start with a dot". For example, glob.glob('*', include_hidden=True) includes all of the conventionally hidden directories and hidden files in the current directory.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VGEQQGHJDI2JMQ2SO6V6ULBUBNNTKDA2/
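The point about dot-files can be checked with a small sketch. It assumes Python 3.11 or later for the include_hidden parameter; the helper name `list_names` is invented for this example.

```python
import glob
import os
import sys


def list_names(directory, include_hidden=False):
    """Return the sorted base names matched by '*' inside `directory`."""
    # Escape the directory part so only the trailing '*' acts as a pattern.
    pattern = os.path.join(glob.escape(directory), "*")
    if include_hidden:
        if sys.version_info < (3, 11):
            raise NotImplementedError("include_hidden needs Python 3.11+")
        matches = glob.glob(pattern, include_hidden=True)
    else:
        matches = glob.glob(pattern)
    return sorted(os.path.basename(path) for path in matches)
```

With the default, a name such as ".hidden" is skipped by '*'. With include_hidden=True it is returned alongside the other entries, regardless of any FILE_ATTRIBUTE_HIDDEN setting.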
[Python-Dev] Re: Add -P command line option to not add sys.path[0]
On 4/26/22, Victor Stinner wrote:
>
> There are 4 main ways to run Python:
>
> (1) python -m module [...]
> (2) python script.py [...]
> (3) python -c code [...]
> (4) python [...]
>
> (1) and (2) insert the directory of the module/script at sys.path[0].

Running a module with -m inserts the current working directory (the path, not an empty string) at sys.path[0], followed by the module directory at sys.path[1]. Only one entry is added if they're the same directory.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/63CBL373SWD7P24TMQOHCJYDP76J4NTL/
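One way to observe the sys.path[0] behaviour for -m is to run a throwaway module in a child interpreter. This is only a sketch: the module name `show_path0` and the helper name are made up, and PYTHONSAFEPATH is cleared because it would suppress the entry on Python 3.11+.

```python
import os
import subprocess
import sys


def path0_under_dash_m(workdir):
    """Run a throwaway module with -m from `workdir`; return its sys.path[0]."""
    module_path = os.path.join(workdir, "show_path0.py")  # hypothetical module
    with open(module_path, "w", encoding="utf-8") as f:
        f.write("import sys\nprint(sys.path[0])\n")
    env = dict(os.environ)
    # Safe-path mode would stop Python from adding the entry at all.
    env.pop("PYTHONSAFEPATH", None)
    result = subprocess.run(
        [sys.executable, "-m", "show_path0"],
        cwd=workdir, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

The value printed by the child is a full directory path rather than an empty string, and it names the working directory the module was launched from.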
[Python-Dev] Re: Restrict the type of __slots__
On 3/19/22, Eryk Sun wrote:
> On 3/18/22, Ronald Oussoren via Python-Dev wrote:
>>
>> - if __slots__ is a dict keep it as is
>> - Otherwise use tuple(__slots__) while constructing the class and store that
>> value in the __slots__ attribute of the class
>
> If this is just for pydoc, then it can be updated with new behavior.
> For example, if the given __slots__ is a dict, set it as something
> like __slots_doc__, and rewrite __slots__ as a tuple of the keys.

Or extend pydoc to support a mappingproxy for __slots__.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NPIZQIZ5OPYU6V7YIGK5TZ75Z6WIOK5O/
[Python-Dev] Re: Restrict the type of __slots__
On 3/18/22, Ronald Oussoren via Python-Dev wrote:
>
> - if __slots__ is a dict keep it as is
> - Otherwise use tuple(__slots__) while constructing the class and store that
> value in the __slots__ attribute of the class

If this is just for pydoc, then it can be updated with new behavior. For example, if the given __slots__ is a dict, set it as something like __slots_doc__, and rewrite __slots__ as a tuple of the keys.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Y6RDNJZGO4FN3DCNTWJM7YA7WFLKIXDV/
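The rewrite suggested in this message can be sketched today with a metaclass. Nothing here exists in CPython: `__slots_doc__` is the placeholder name floated in the message, and `SlotsDocMeta` and `Point` are hypothetical names for the illustration.

```python
class SlotsDocMeta(type):
    """Hypothetical metaclass sketching the __slots_doc__ idea."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        slots = namespace.get("__slots__")
        if isinstance(slots, dict):
            namespace = dict(namespace)
            # Keep the per-slot docs under a separate, made-up attribute ...
            namespace["__slots_doc__"] = dict(slots)
            # ... and store only the slot names in __slots__.
            namespace["__slots__"] = tuple(slots)
        return super().__new__(mcls, name, bases, namespace, **kwargs)


class Point(metaclass=SlotsDocMeta):
    __slots__ = {"x": "horizontal coordinate", "y": "vertical coordinate"}
```

After class creation, `Point.__slots__` is the tuple `('x', 'y')` and the documentation strings remain reachable through `Point.__slots_doc__`, which is the split a tool such as pydoc would need to learn about.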
[Python-Dev] Re: Improvements to the sys.path initialization documentation
On 3/4/22, Victor Stinner wrote:
> it would be nice to move the last bits of the sys.path initialization
> from the site module to the getpath module. It's unpleasant to
> have a different sys.path depending if the site module is loaded
> or not.

I don't understand. The site packages directories, including virtual environments, are a site extension.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5ZO73YHNL3BHXY4MHRITOGOECL2SZKPO/
[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)
On 11/13/21, Terry Reedy wrote: > On 11/13/2021 4:35 PM, pt...@austin.rr.com wrote: >> >> _퓟Ⅼ햠홲험ℋ풪Lᴰ푬핽﹏핷피헡 = 12 >> >> def _픰ʰ퓸ʳ핥홚푛(픰, p푟픢fi햝핝횎푛, sᵤ푓헳헂푥헹ₑ횗): >> >> ˢ헸i헽 = 퐥e혯(햘) - pr횎햋퐢x헅ᵉ퓷 - 풔홪ffi혅헹홚ₙ >> >> if ski혱 > _퐏헟햠혊홴H핺L핯홀혙﹏L픈풩: >> >> 혴 = '%s[%d chars]%s' % (홨[:혱퐫핖푓핚xℓ풆핟], ₛ횔풊p, 퓼[퓁풆햓(횜) - >> 홨횞풇fix홡ᵉ혯:]) >> >> return ₛ >> > * Does not at all work in CommandPrompt It works for me when pasted into the REPL using the console in Windows 10. I pasted the code into a raw multiline string assignment and then executed the string with exec(). The only issue is that most of the pasted characters are displayed using the font's default glyph since the console host doesn't have font fallback support. Even Windows Terminal doesn't have font fallback support yet in the command-line editing mode that Python's REPL uses. But Windows Terminal does implement font fallback for normal output rendering, so if you assign the pasted text to string `s`, then print(s) should display properly. > even after supposedly changing to a utf-8 codepage with 'chcp 65000'. Changing the console code page is unnecessary with Python 3.6+, which uses the console's wide-character API. Also, even though it's irrelevant for the REPL, UTF-8 is code page 65001. Code page 65000 is UTF-7. ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7FGNJ7TMASDOMQAS2LSSQAD2PPURT5W6/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: What is __int__ still useful for?
On 10/15/21, Mark Dickinson wrote:
>
> the proposal would be to remove that special role of `__trunc__` and
> reduce the `int` constructor to only looking at `__int__` and `__index__`.

For Real and Rational numbers, currently the required method to implement is __trunc__(). ISTM that this proposal should include a change to require __int__() in numbers.Real.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/R3JI6EGLMMBDNPCKFSNPLUNR2Q3ISAID/
[Python-Dev] Re: What is __int__ still useful for?
On 10/14/21, Antoine Pitrou wrote:
> On Wed, 13 Oct 2021 17:00:49 -0700
> Guido van Rossum wrote:
>>
>> so int() can't call __trunc__ (as was explained earlier in
>> the thread).

I guess this was meant to be "*just* call __trunc__". It's documented that the int constructor calls the initializing object's __trunc__() method if the object doesn't implement __int__() or __index__().

> Note that PyNumber_Long() is now the only place inside the interpreter
> calling the `nb_int` slot. But since it also has those undesirable code
> paths accepting str and buffer-like objects, it's usable in fewer
> situations than you'd expect.

Maybe an alternate constructor could be added -- such as int.from_number() -- which would be restricted to calling __int__(), __index__(), and __trunc__().

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Q77PFIMCHDGB36LZTNMFG6NF7DE2UOSF/
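A plain-function sketch of what such a restricted constructor might do is shown below. `int.from_number()` was only a suggestion at the time of this message, so `int_from_number` is a hypothetical stand-in, and it simplifies by insisting that each method return an actual int.

```python
def int_from_number(obj):
    """Hypothetical stand-in for the suggested int.from_number()."""
    cls = type(obj)
    # Look the methods up on the type, as the interpreter does for dunders.
    for name in ("__int__", "__index__", "__trunc__"):
        method = getattr(cls, name, None)
        if method is None:
            continue
        result = method(obj)
        if not isinstance(result, int):
            raise TypeError(
                f"{name} returned non-int (type {type(result).__name__})"
            )
        return int(result)
    # No str or buffer parsing: anything without the three methods is rejected.
    raise TypeError(f"{cls.__name__} object cannot be interpreted as a number")
```

Unlike int(), this rejects strings and bytes-like objects outright, which is the restriction the message is asking for.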
[Python-Dev] Re: Have virtual environments led to neglect of the actual environment?
On 2/28/21, Oscar Benjamin wrote:
>
> Oh, okay. So does that mean that it's always on PATH unless the user
> *explicitly unticks* the "install the launcher" box for both single
> user and all user installs?

If the launcher gets installed, it will be available in PATH. IIRC, the installer only allows installing the launcher for the current user if it is not installed already for all users. Thus only one version should ever exist in PATH, which is set in either the user or system "PATH" value in the registry, but not both.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3QWZRQSI5BTM3GQRWNH2UQN4L7RONAPJ/
[Python-Dev] Re: Have virtual environments led to neglect of the actual environment?
On 2/28/21, Oscar Benjamin wrote:
>
> - It is possible to configure a default version (although I think you
> have to do it with an environment variable)

The py launcher in Windows supports a "py.ini" file beside the executable and in %LocalAppData%. The equivalent of the PY_PYTHON, PY_PYTHON2, and PY_PYTHON3 environment variables can be set in the "[defaults]" section as "python", "python2", and "python3" settings. The ini file also supports a "[commands]" section to define additional virtual commands for shebangs. Regular filepaths are also supported in shebangs -- e.g. #!"path\to\any\program.exe".

> - I think that the launcher is only installed in an all users install.

It defaults to an all-users install, but it can also be installed for just the current user in "%LocalAppData%\Programs\Python\Launcher". In this case, the installation directory always gets added to PATH.

> - Listing installations with "py -0p" is somewhat cryptic

There are also the long-form options `--list` and `--list-paths`.

> - It would be better if you could use the launcher itself to set the
> default Python e.g. "py --set-default-python=3.8"

It's pretty simple to run `set PY_PYTHON=3.8`, and persist the value in the registry with `setx.exe PY_PYTHON 3.8`. (But don't use setx.exe naively to set PATH.)

> On the last point I think that although Anaconda doesn't install the
> launcher you can use the launcher to run the python from the Anaconda
> installation.

I don't use Anaconda, but I don't think that's supposed to be the case according to PEP 514. The launcher only looks for PSF development distributions in the "PythonCore" registry key.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/JPQLBEYVWV2XGXO44JJ4YPN3KFJJYN2Q/
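For the py.ini route, the file content is small enough to generate with configparser. This is a minimal sketch: `render_py_ini` is a made-up helper, and the version numbers are placeholders rather than recommendations.

```python
import configparser
import io


def render_py_ini(python="3.8", python3=None):
    """Return py.ini text with a [defaults] section for the py launcher."""
    config = configparser.ConfigParser()
    config["defaults"] = {"python": python}
    if python3 is not None:
        config["defaults"]["python3"] = python3
    buffer = io.StringIO()
    # Write "key=value" without padding around the delimiter.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()
```

Saving the returned text as "py.ini" in %LocalAppData% would then play the role of the PY_PYTHON and PY_PYTHON3 environment variables described in the message.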
[Python-Dev] Re: PEP 597: Add optional EncodingWarning
On 2/11/21, Inada Naoki wrote: > > There is little difference between `encoding=None` and > `encoding=locale.getpreferredencoding(False)`. The difference is: > > * When Python is using Windows, and > * When when the file is console, and > * (for open()) When PYTHONLEGACYWINDOWSSTDIO is set > * (for TextIOWrapper()) When the file is not _WindowsConsoleIO > > encoding=None uses console codepage but os.device_encoding() -- i.e. _Py_device_encoding() -- only works for hard-coded file descriptors 0, 1, and 2, instead of detecting a console file. So opening "CON", "CONIN$", or "CONOUT$" has never used the console input or output code page, nor has opening a duped standard I/O fd such as open(os.dup(0)). It would be easy to generalize _Py_device_encoding() to detect console files, but it's new behavior. Python 3.8+ introduced a bug (issue 42261) in which, even with legacy standard I/O enabled and file descriptors 0-2, the console input and output code pages are ignored. For example: C:\>chcp 437 Active code page: 437 C:\>set PYTHONLEGACYWINDOWSSTDIO=1 C:\>py -3.9 -c "import sys; print(sys.stdout.encoding)" cp1252 Regarding the last bullet point, io.TextIOWrapper doesn't know anything about io._WindowsConsoleIO. The decision to use UTF-8 is in io.open(). So manually wrapping a _WindowsConsoleIO file with TextIOWrapper uses the locale preferred encoding instead of UTF-8. For example: >>> fb = open('conin$', 'rb') >>> fb.raw <_io._WindowsConsoleIO mode='rb' closefd=True> >>> f = io.TextIOWrapper(fb) >>> f.encoding 'cp1252' I don't know whether it's worth making TextIOWrapper check for _WindowsConsoleIO in order to make it use UTF-8. It's not common to manually wrap a binary-mode file. 
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QBNH3XGSNBQ7XIJ5E542JIQ5Q5E63MCU/
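The last observation in this message, that a manually created TextIOWrapper falls back to the locale's preferred encoding, can be checked on any platform with an in-memory binary stream. The helper name is invented for this sketch, and the names are normalized through codecs.lookup() because spelling varies between platforms.

```python
import codecs
import io
import locale


def default_wrapper_encoding():
    """Encoding TextIOWrapper picks for a binary stream when none is given."""
    wrapper = io.TextIOWrapper(io.BytesIO())
    try:
        # Normalize spellings such as 'UTF-8' versus 'utf-8'.
        return codecs.lookup(wrapper.encoding).name
    finally:
        wrapper.close()
```

Comparing the result with `codecs.lookup(locale.getpreferredencoding(False)).name` shows the two agree, since TextIOWrapper itself has no special knowledge of the underlying raw file type.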
[Python-Dev] Re: os.scandir bug in Windows?
On 10/28/20, Stephen J. Turnbull wrote:
>
> Note: you can "fix" directory updates by mounting the filesystem r/o.

Mounting the filesystem as readonly is the extreme case. Popular Unix systems support a "noatime" mount option that disables updating file access times, unless one of the other timestamps changes. In Windows, NTFS and ReFS support a system setting (but not per-volume) to disable updating access times -- "NtfsDisableLastAccessUpdate" and "RefsDisableLastAccessUpdate".

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/E5AWEB3U5ZCQBWABOKAGL6CADRHBLEEP/
[Python-Dev] Re: os.scandir bug in Windows?
On 10/26/20, Victor Stinner wrote: > Le lun. 19 oct. 2020 à 13:50, Steve Dower a écrit > : >> Feel free to file a bug, but we'll likely only add a vague note to the >> docs about how Windows works here rather than changing anything. > > I agree that this surprising behavior can be documented. Attempting to > provide accurate access time in os.scandir() is likely to slow-down > the function which would defeat its whole purpose. I don't think the access time (st_atime) is a significant concern. I'm concerned with the reliability of the file size (st_size) and last-write time (st_mtime) in stat() results. Developers are used to various filesystem policies on various platforms that limit when the access time gets updated, if at all. FAT32 filesystems only have an access date, and the driver in Windows fixes the access time at midnight. Updating the access time in NTFS and ReFS can be completely disabled at the system level; otherwise it's updated with a granularity of one hour if it's only the access time that would be updated. The biggest concern for me is NTFS hardlinks, for which the st_size and st_mtime in the directory entry is unreliable. When a file with multiple hardlinks is modified, the filesystem only updates the duplicated information in the directory entry of the opened link. Because the entry in the directory doesn't include the link count or even a boolean value to indicate that a file has multiple hardlinks, if you don't know whether or not there's a possibility of hardlinks, then os.stat() is required in order to reliably determine st_size and st_mtime, to the extent that reliably knowing st_mtime is possible. A general problem that affects even os.stat() is that a modified file may only be noted by setting a flag (FO_FILE_MODIFIED) in the kernel file object of the particular open. Whether it's immediately noted in the last-write time of the shared FCB (file control block) is up to filesystem policy. 
Starting with Windows 10 1809 (as noted in [MS-FSA]), NTFS immediately notes the modification time, so the st_mtime value from os.stat() is current. In prior versions of NTFS, and with other Microsoft filesystems such as FAT32, the last-write time is only noted when the file is flushed to disk via FlushFileBuffers (i.e. os.fsync) or when the open is closed. This means that st_size may change without also changing st_mtime. I'm using Windows 10 2004 currently, so I can't show an NTFS example, but the following shows the behavior with FAT32:

    import os
    import time

    f = open('spam.txt', 'w')
    st1 = os.stat('spam.txt')
    time.sleep(10)
    f.write('spam')
    f.flush()
    st2 = os.stat('spam.txt')

The above write was noted only by setting the FO_FILE_MODIFIED flag on the kernel file object. (The file object can be inspected with a local kernel debugger.) The write time wasn't noted in the FCB, i.e. st_mtime hasn't changed in st2:

    >>> st2.st_size - st1.st_size
    4
    >>> st2.st_mtime - st1.st_mtime
    0.0

The last-write time is noted when FlushFileBuffers (os.fsync) is called on the open:

    >>> os.fsync(f.fileno())
    >>> st3 = os.stat('spam.txt')
    >>> st3.st_mtime - st1.st_mtime
    10.0

Note also that, with NTFS, to the extent that the FCB metadata is current, calling os.stat() on a link updates the duplicated information in the directory entry. So calling os.stat() on a NTFS file may update the entry that's returned by a subsequent os.scandir() call.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LEBCSKGSL7PMAFH6AQR5LFL7UJ4T5774/
[Python-Dev] Re: os.scandir bug in Windows?
On 10/19/20, Greg Ewing wrote: > On 20/10/20 4:52 am, Gregory P. Smith wrote: >> Those of us with a traditional posix filesystem background may raise >> eyeballs at this duplication, seeing a directory as a place that merely >> maps names to inodes > > This is probably a holdover from MS-DOS, where there was no separate > inode-like structure -- it was all in the directory entry. DOS implemented a find-first/find-next API (int 21h 4E/4F) that provided a file's name, attributes, size, and last write time/date. I think it's clear that the design was influenced by the readily-available contents of a FAT dirent. The Win32 API extended this to FindFirstFile/FindNextFile, with added support for the long filename, create and access times, and, in NT 5+, the reparse tag for a reparse point. NTFS had to support this metadata in the directory index, else FindFirstFile/FindNextFile would be too expensive if the filesystem had to fetch the metadata from the MFT for every matching file in a listing. It tries to keep the duplicated metadata in sync -- such as when a file is open, closed, manually extended in size, when the cache is flushed, or when metadata is explicitly set (e.g. SetFileInformationByHandle: FileBasicInfo). But for performance it doesn't update the duplicated data every time a file is read from or written to. And, in particular, if it's just the access time that changed, it updates the duplicated access time with a one-hour granularity. (There's also a registry value, as I mentioned previously, that disables updating access times completely -- in both the MFT record and the directory index.) That said, if a file has multiple hardlinks the current NTFS implementation for updating duplicated data is totally unreliable. It only updates the accessed link. All other links go stale. We don't have any reasonable way to special case this situation because the directory entry doesn't include the number of links a file has. 
It has to be opened and queried directly, but then one might as well do a full stat() for every file.

I recommend relying on only the high-level is_dir(), is_file(), and is_symlink() methods of os.scandir() items, to quickly process a directory. inode() is reliable -- as much as is possible in Windows -- because the implementation gets the full stat info, but check to ensure it's not 0. It's based on the file ID, which Windows filesystems aren't required to support (or reliably support; it's not stable in FAT). NTFS and ReFS support reliable 64-bit file IDs, and opening by file ID.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/JKK47AWKUOWPPBEAIRGIFRMW6FCPZILG/
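A directory pass that follows this recommendation might look like the sketch below. It touches only the type checks on each entry and never the size or timestamp fields; `classify_entries` and its bucket names are made up for the example.

```python
import os


def classify_entries(directory):
    """Bucket entry names using only DirEntry type checks."""
    kinds = {"dirs": [], "files": [], "symlinks": [], "other": []}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_symlink():
                kinds["symlinks"].append(entry.name)
            elif entry.is_dir(follow_symlinks=False):
                kinds["dirs"].append(entry.name)
            elif entry.is_file(follow_symlinks=False):
                kinds["files"].append(entry.name)
            else:
                kinds["other"].append(entry.name)
    # Sort for stable output; scandir() order is arbitrary.
    return {kind: sorted(names) for kind, names in kinds.items()}
```

Code that needs sizes or timestamps for a particular entry can then call os.stat() on just that path.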
[Python-Dev] Re: os.scandir bug in Windows?
On 10/19/20, Steve Dower wrote:
>
> Resolving the path is the most expensive part, even if the file is not
> opened (I've been working with the NTFS team on this area, and we've
> been benchmarking/analysing all of it).

If you say it's been extensively benchmarked and there's no direct way around the speed bottleneck, then I take your word for it. To clarify what I had in mind, I was hoping that because NTFS implements the fast I/O function FastIoQueryOpen [1] (via NtfsNetworkOpenCreate, as given by its FastIoDispatch table) that IRP_MJ_CREATE would be bypassed and that the filesystem would not incur a significant cost to parse the remaining path. I figured that most of the work would be in the ObOpenObjectByName and IopParseDevice executive calls that lead up to querying the filesystem.

Anyway, it's unfortunate that the Windows API doesn't support NT handle-relative names, except in the registry API. If we could call NTAPI NtQueryAttributesFile [2] directly, then the ObjectAttributes argument could be relative to a directory handle set in the RootDirectory field. That would eliminate the vast majority of the path-resolution cost. A handle-relative open or query goes straight to the filesystem device, which goes straight to the directory that contains the file.

To eliminate the cost of opening the directory handle, scandir() could be rewritten to use CreateFileW and GetFileInformationByHandleEx: FileIdBothDirectoryInfo [3] instead of FindFirstFileW / FindNextFileW. Just cache the directory handle in place of caching the find handle. scandir() would gain fd support in Windows. Opening a directory via os.open requires the flag _O_OBTAIN_DIR (0x2000), defined in fcntl.h.

FileIdBothDirectoryInfo provides the file ID, so the implementation would support the inode() method without calling stat(). It would still directly support is_dir() and is_file() based on the file attributes, and is_symlink() based on the file attributes and the EaSize field. The Windows Protocols document that the latter contains the reparse tag for a reparse point. The field is reused because a reparse point can't have extended attributes.

All that said, I don't prefer to call NtQueryAttributesFile or any other NTAPI function in Windows Python. I'd rather do the best possible with just the Windows API. I wish there were a new GetFileAttributesExExW function that supported handle-relative names. Even better would be a new function that calls NtQueryInformationByName -- something like GetFileInformationByName -- for FileStatInfo (and FileCaseSensitiveInfo as well, which is becoming more of an issue), also with support for handle-relative names.

[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/ns-wdm-_fast_io_dispatch
[2] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-zwqueryfullattributesfile
[3] https://docs.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_id_both_dir_info

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GODUIB5WKVZLX4BVPEM2NS37JFHUXIID/
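For comparison, the closest thing Python offers to a handle-relative query is the dir_fd parameter on POSIX systems. The sketch below is only an analogy to the NT RootDirectory idea, not a Windows implementation: `stat_relative` is a made-up helper, and it falls back to an ordinary path lookup wherever os.stat() does not advertise dir_fd support.

```python
import os


def stat_relative(directory, name):
    """stat `name` relative to an open directory when the platform allows it."""
    if os.stat in os.supports_dir_fd:
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            # The lookup starts at the already-open directory.
            return os.stat(name, dir_fd=dir_fd)
        finally:
            os.close(dir_fd)
    # Fallback: ordinary path-based lookup.
    return os.stat(os.path.join(directory, name))
```

A scandir() implementation that cached a directory handle could in principle serve many such lookups from one open, which is the saving the message is after.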
[Python-Dev] Re: os.scandir bug in Windows?
On 10/19/20, Steve Dower wrote: > On 19Oct2020 1242, Steve Dower wrote: >> On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote: >>> TLDR: In os.scandir directory entries, atime is always a copy of mtime >>> rather than the actual access time. >> >> Correction - os.stat() updates the access time to _now_, while >> os.scandir() returns the last access time without updating it. > > Let me correct myself first :) > > *Windows* has decided not to update file access time metadata *in > directory entries* on reads. os.stat() always[1] looks at the file entry > metadata, while os.scandir() always looks at the directory entry metadata. > > My suggested approach still applies, other than the bit where we might > fix os.stat(). The best we can do is regress os.scandir() to have > similarly poor performance, but the best *you* can do is use os.stat() > for accurate timings when files might be being modified while your > program is running, and don't do it when you just need names/kinds (and > I'm okay adding that note to the docs). If this is the correction to which you're referring in the previous message, I assumed you stood by the claim that os.stat() may update st_atime. That shouldn't be the case, so there shouldn't be anything that needs to be fixed there, unless I'm missing what you think needs to be fixed. If it's actually a problem, then I'd really, really like a test case that reproduces it. If it was just a misinterpreted test case or mis-remembered fact, then that's good news for me. ;-) Regarding updating the access time in the directory entry, in my previous reply I explained that NTFS should update it with a one-hour granularity. With FAT, it's daily. 
Regarding the view that this is only about "accurate timings when files might be being modified while your program is running", in my previous messages I stressed that the directory entry for a hard link may have the wrong size, change time, write time, and access time if it wasn't the last link used to update the file. That has nothing to do with the file being modified while the program is running. It's a stale directory entry. If you call os.stat() on the stale link, NTFS will update it with the correct metadata.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/SUGIZ6OAXOD37USVBWAW7JRSUDBSMG7Q/
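A quick way to look for stale entries of this kind is to compare what each os.scandir() entry reports with a fresh os.stat() of the same path. This is a sketch with an invented helper name. On POSIX the two always agree because DirEntry.stat() performs a real stat call, so a difference would only be expected on Windows in cases like the hard-link one described here.

```python
import os


def entries_differing_from_stat(directory):
    """Names whose DirEntry size/mtime differ from a fresh os.stat()."""
    differing = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            # On Windows this comes from the directory listing data.
            cached = entry.stat(follow_symlinks=False)
            # This queries the file itself.
            fresh = os.stat(entry.path, follow_symlinks=False)
            if (cached.st_size, cached.st_mtime_ns) != (
                fresh.st_size,
                fresh.st_mtime_ns,
            ):
                differing.append(entry.name)
    return sorted(differing)
```

Because the os.stat() call may itself refresh the directory entry on NTFS, a second run over the same directory can come back empty.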
[Python-Dev] Re: os.scandir bug in Windows?
On 10/19/20, Steve Dower wrote: > On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote: >> TLDR: In os.scandir directory entries, atime is always a copy of mtime >> rather than the actual access time. > > Correction - os.stat() updates the access time to _now_, while > os.scandir() returns the last access time without updating it. os.stat() shouldn't affect st_atime because it doesn't access the file data. That has me curious if it can be reproduced. With NTFS in Windows 10, I'd expect the os.stat() st_atime to change immediately when the file data is read or modified. With other filesystems, it may not be updated until the kernel file object that was used to access the file's data is closed. Note that updating the access time in NTFS can be disabled by the "NtfsDisableLastAccessUpdate" value in "HKLM\System\CurrentControlSet\Control\FileSystem". The default value in Windows 10 should be 0x8002, which means the value is system managed and updating the access time is enabled. If it's only the access time that changes, the directory entry may be updated with a significant granularity such as hourly or daily. For NTFS, it's hourly. To confirm this, wait an hour from the current access time in the directory entry; open the file; read some data; and close the file. The access time in the directory entry should be updated. For details, download the [MS-FSA] PDF [1] and look for all references to the following sections: * 2.1.4.17 Algorithm for Noting That a File Has Been Modified * 2.1.4.19 Algorithm for Noting That a File Has Been Accessed * 2.1.4.18 Algorithm for Updating Duplicated Information Also check the tables in Appendix A, which provide the update granularity of file time stamps (presumably for directory entries) for common Windows filesystems. 
[1] https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fsa/860b1516-c452-47b4-bdbc-625d344e2041 Going back to my initial message, I can't stress enough that this problem is at its worst when a file has multiple hardlinks. If a particular link in a directory wasn't the last link used to access the file, its duplicated metadata may have the wrong file size, access time, modify time, and change time (the latter is not reported by Python). As is, for the current implementation, I'd only rely on the basic attributes such as whether it's a directory or reparse point (symlink, mountpoint, etc) when using scandir() to quickly process a directory. For reliable stat information, call os.stat(). I do think, however, that os.scandir() can be improved in Windows without significant performance loss if it calls GetFileAttributesExW to get st_file_attributes, st_size, st_ctime (create time), st_mtime, and st_atime. This API call is relatively fast because it doesn't require opening the file via CreateFileW, which is one of the more expensive operations in os.stat(). But I haven't tried modifying scandir() to benchmark it. Ultimately, I'm waiting for Windows 10 to provide a WinAPI function that calls the relatively new NTAPI function NtQueryInformationByName [2] (by name, not by handle!) to get the FileStatInformation, as well as for this information to be made available by handle via GetFileInformationByHandleEx. Compared to GetFileAttributesExW, the FileStatInformation additionally provides the file ID (if implemented by the filesystem), change time, reparse tag, number of links, and the effective access of the security context of the caller (i.e. process or thread access token). The latter is something that we've never impemented with os.stat(). It's not the same as POSIX owner-group-other permissions. It would need a new attribute such as st_effective_access. It could be used to provide a real implementation of os.access() in Windows. 
[2] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-ntqueryinformationbyname ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NPP6GKAEI7SOVA45WTJ222YVEALTF6WO/ Code of Conduct: http://python.org/psf/codeofconduct/
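To go with the message above, here is a minimal sketch for checking on a given system whether the stat data cached in an os.scandir() entry matches a fresh os.stat() call. The helper name compare_scandir_with_stat is made up for illustration. On POSIX the two results normally agree because DirEntry.stat() performs a real stat call; any mismatch would be expected mainly on Windows, where the entry data comes from the directory listing.

```python
import os


def compare_scandir_with_stat(directory):
    """Return {name: (entry_stat, fresh_stat)} for regular files in `directory`.

    `entry_stat` is what the directory entry reports (possibly cached from
    the directory listing); `fresh_stat` comes from a separate os.stat() call.
    """
    results = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            entry_stat = entry.stat(follow_symlinks=False)
            fresh_stat = os.stat(entry.path, follow_symlinks=False)
            results[entry.name] = (entry_stat, fresh_stat)
    return results


def stale_fields(entry_stat, fresh_stat):
    """List the names of the size/time fields that differ between the two results."""
    fields = ('st_size', 'st_atime', 'st_mtime')
    return [f for f in fields
            if getattr(entry_stat, f) != getattr(fresh_stat, f)]
```

Calling stale_fields() on each pair gives a quick list of the fields that disagree, which is one way to try to reproduce the behavior discussed in this thread.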
[Python-Dev] Re: os.scandir bug in Windows?
On 10/15/20, Rob Cliffe via Python-Dev wrote: > > TLDR: In os.scandir directory entries, atime is always a copy of mtime > rather than the actual access time. There are inconsistencies in various scenarios between the stat info from the directory entry and the stat info from the File Control Block (FCB) -- the filesystem's in-memory record that's common to all opens for a file/directory. The worst case is for an NTFS file with multiple hardlinks, for which the directory entry information is from the last time the file was opened using a particular hardlink. The accurate NTFS file information is in the file's Master File Table (MFT) record, which gets accessed to populate the FCB and update the particular link when a file is opened. If you're looking for file times and file size, the only reliable information comes from directly opening the file and querying the info via GetFileInformationByHandle (called by os.stat), GetFileInformationByHandleEx (FileBasicInfo, FileStandardInfo), GetFileTime, and GetFileSizeEx. ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IJIFZHPEEMVPD2LN6H3MY4KGRKNQ4TBQ/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: os.add_dll_directory and DLL search order
On 6/22/20, Steve Dower wrote: > > What is likely happening here is that _sqlite3.pyd is being imported > before _mapscript, and so there is already a SQLITE3 module in memory. > Like Python, Windows will not attempt to import a second module with the > same name, but will return the original one. Qualified DLL loads won't interfere with each other, but dependent DLLs are loaded by base name only. In these cases a SxS assembly allows loading multiple DLLs that have the same base name. If the assembly is referenced by a DLL, embed the manifest in the DLL as resource 2. For example:

    >>> import ctypes
    >>> test1 = ctypes.CDLL('./test1')
    >>> test2 = ctypes.CDLL('./test2')
    >>> test1.call_spam.restype = None
    >>> test2.call_spam.restype = None

    >>> test1.call_spam()
    spam v1.0
    >>> test2.call_spam()
    spam v2.0

    >>> import win32process, win32api
    >>> names = [win32api.GetModuleFileName(x)
    ...     for x in win32process.EnumProcessModules(-1)]
    >>> spams = [x for x in names if 'spam' in x]
    >>> print(*spams, sep='\n')
    C:\Temp\test\c\spam.dll
    C:\Temp\test\c\spam_assembly\spam.dll

Source

spam1.c (spam.dll):

    #include <stdio.h>

    void __declspec(dllexport) spam()
    {
        printf("spam v1.0\n");
    }

test1.c (test1.dll):

    #pragma comment(lib, "spam")

    void __declspec(dllimport) spam();

    void __declspec(dllexport) call_spam()
    {
        spam();
    }

---

spam_assembly/spam_assembly.manifest:

spam2.c (spam_assembly/spam.dll):

    #include <stdio.h>

    void __declspec(dllexport) spam()
    {
        printf("spam v2.0\n");
    }

test2.c (test2.dll -- link with /manifest:embed,id=2):

    #pragma comment(lib, "spam")
    #pragma comment(linker, "/manifestdependency:\"\
        type='win32' \
        name='spam_assembly' \
        version='2.0.0.0' \
        processorArchitecture='amd64' \"")

    void __declspec(dllimport) spam();

    void __declspec(dllexport) call_spam()
    {
        spam();
    }

___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5PDVL7KOBCCIVRSYQH4WXHBCZ23KYKG3/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: What to do about invalid escape sequences
On 8/10/19, Rob Cliffe via Python-Dev wrote: > On 10/08/2019 11:50:35, eryk sun wrote: >> On 8/9/19, Steven D'Aprano wrote: >>> I'm also curious why the string needs to *end* with a backslash. Both of >>> these are the same path: >>> >>> C:\foo\bar\baz\ >>> C:\foo\bar\baz > > Also, the former is simply more *informative* - it tells the reader that > baz is expected to be a directory, not a file. This is an important point that I overlooked. The trailing backslash is more than just a redundant character to inform human readers. Refer to [MS-FSA] 2.1.5.1 "Server Requests an Open of a File" [1]. A create/open fails with STATUS_OBJECT_NAME_INVALID if either of the following is true: * PathName contains a trailing backslash and CreateOptions.FILE_NON_DIRECTORY_FILE is TRUE. * PathName contains a trailing backslash and StreamTypeToOpen is DataStream For NtCreateFile or NtOpenFile (in the NT API), the FILE_NON_DIRECTORY_FILE option restricts the call to a regular file, and FILE_DIRECTORY_FILE restricts it to a directory. With neither option, the call can target either a file or directory. A trailing backslash is another information channel. It tells the filesystem that the target has to be a directory. If we specify FILE_NON_DIRECTORY_FILE with a trailing backslash on the name, this is an immediate failure as an invalid name without even checking the entry. If we specify neither option and use a trailing backslash, it's an invalid name if the filesystem finds a regular file or data stream. Had the call specified the FILE_DIRECTORY_FILE option, it would instead fail with STATUS_NOT_A_DIRECTORY. We can see this in practice in the published source for the fastfat filesystem driver. 
FatCommonCreate [2] (for a create or open) has the following code to handle the second case (in this code, an FCB is a file control block for a regular file, and a DCB is a directory control block):

    if (NodeType(Fcb) == FAT_NTC_FCB) {

        //
        //  Check if we were only to open a directory
        //

        if (OpenDirectory) {
            DebugTrace(0, Dbg, "Cannot open file as directory\n", 0);
            try_return( Iosb.Status = STATUS_NOT_A_DIRECTORY );
        }

        DebugTrace(0, Dbg, "Open existing fcb, Fcb = %p\n", Fcb);

        if ( TrailingBackslash ) {
            try_return( Iosb.Status = STATUS_OBJECT_NAME_INVALID );
        }

We observe the first case with a typical CreateFileW call, which uses the option FILE_NON_DIRECTORY_FILE. In the following example "baz" is a regular file:

    >>> f = open(r'foo\bar\baz') # success
    >>> try: open('foo\\bar\\baz\\')
    ... except OSError as e: print(e)
    ...
    [Errno 22] Invalid argument: 'foo\\bar\\baz\\'

C EINVAL (22) is mapped from Windows ERROR_INVALID_NAME (123), which is mapped from NT STATUS_OBJECT_NAME_INVALID (0xC0000033). We can observe the second case with os.stat(), which calls CreateFileW with backup semantics, which omits the FILE_NON_DIRECTORY_FILE option in order to allow the call to open either a file or directory. In this case the filesystem has to actually check that "baz" is a data file before it can fail the call, as was shown in the fastfat code snippet above:

    >>> try: os.stat('foo\\bar\\baz\\')
    ... except OSError as e: print(e)
    ...
    [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'foo\\bar\\baz\\'

[1] https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fsa/8ada5fbe-db4e-49fd-aef6-20d54b748e40
[2] https://github.com/microsoft/Windows-driver-samples/blob/74200/filesys/fastfat/create.c#L1398

___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QPDXUY4OXR2XOCNUHSKC7QRQGAXWV5WQ/
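As a portable way to try the trailing-separator behavior from this message, the sketch below appends os.sep to the path of a regular file and reports the resulting error. The helper name is made up for illustration. The exact errno depends on the platform: the message above shows EINVAL on Windows, while POSIX systems typically report ENOTDIR.

```python
import os


def open_with_trailing_separator(path):
    """Try to open `path` with a trailing separator appended.

    Returns the OSError that was raised, or None if the open succeeded.
    A trailing separator tells the filesystem that the target must be a
    directory, so this is expected to fail when `path` is a regular file.
    """
    try:
        with open(path + os.sep):
            return None
    except OSError as e:
        return e
```

Inspecting the returned exception's errno (and winerror, where available) shows which of the failure paths described above was taken.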
[Python-Dev] Re: What to do about invalid escape sequences
On 8/10/19, eryk sun wrote: > > The per-logon directory is located at "\\Sessions\\0\\DosDevices\\<Logon Session ID>". In the Windows API, it's accessible as "//?/" or "//./", > or with any mix of forward slashes or backslashes, but only the > all-backslash form is special-cased to bypass the normalization step. Correction: I slipped up in that last sentence. Only the all-backslash form that's in the "?" namespace bypasses normalization, as most Windows users should at least have seen in passing. These special device paths pop up here and there. For example, r'\\?\C:\Temp\spam. . .' allows creating or opening a file named "spam. . .", which the Windows API would normalize as "spam". But I don't recommend sidestepping the normal rules -- except for the path length limit because there are ways to make long paths conveniently accessible (e.g. symbolic links, bind-like mountpoints, and subst drives). Sometimes people also come across "\\??\\" paths and come to the mistaken conclusion that these can be used in Windows API programs. No, they're for NT. The runtime library mangles them, e.g. nt._getfullpathname(r'\??\C:') == 'C:\\??\\C:'. ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VANNT2SIH7EBPEOUC6M7HI7PYASJPYC7/
[Python-Dev] Re: What to do about invalid escape sequences
On 8/9/19, Steven D'Aprano wrote: > > I'm also curious why the string needs to *end* with a backslash. Both of > these are the same path: > > C:\foo\bar\baz\ > C:\foo\bar\baz The above two cases are equivalent. But that's not the case for the root directory. Unlike Unix, filesystem namespaces are implemented directly on devices. For example, "//./C:" might resolve to a volume device such as "\\Device\\HarddiskVolume2". With a trailing slash added, "//./C:/" resolves to "\\Device\\HarddiskVolume2\\", which is the root directory of the mounted filesystem on the volume. Also, as a classic DOS path, "C:" without a trailing slash expands to the working directory on drive "C:". The system runtime library looks for this path in a hidden environment variable named "=C:". The Windows API never sets these hidden "=X:" drive variables. The C runtime sets them, as does Python's os.chdir. Some volume-management functions require a trailing slash or backslash, such as GetVolumeInformationW [1]. GetVolumeNameForVolumeMountPointW [2] actually requires it to be a trailing backslash. It will not accept a trailing forward slash such as "C:\\Mount\\Volume/" (a bug since Windows 2000). The volume name (e.g. "\\\\?\\Volume{GUID}\\") returned by the latter includes a trailing backslash, which must be present in the target path in order for a mountpoint to function properly as a directory, else it would resolve to the volume device instead of the root directory. [1] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getvolumeinformationw [2] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getvolumenameforvolumemountpointw > If they're Windows developers, they ought to be aware that the Windows > file system API allows / anywhere you can use \ and it is the > common convention in Python to use forward slashes. The Windows file API actually does not allow slash to be used anywhere that we can use backslash. It's usually allowed, but not always. 
For the most part, the conditions where forward slash is not supported are intentional. Windows replaces forward slash with backslash in normal DOS paths and normal device paths. But sometimes we have to use a special form of device path that bypasses normalization. A path that isn't normalized can only use backslash as the path separator. For example, the most common case is that the process doesn't have long paths enabled. In this case we're limited to MAX_PATH, which limits file paths to a paltry 259 characters (sans the terminating null); the current directory to 258 characters (sans a trailing backslash and null); and the path of a new directory to 247 characters (subtract 12 from 259 to leave space for an 8.3 filename). By skipping DOS normalization, we can access a path with up to about 32,750 characters (i.e. 32,767 sans the length of the device name in the final NT path under "\\Device\\"). (Long normalized paths are available starting in Windows 10, but the system policy that allows this is disabled by default, and even if enabled, each application has to declare itself to be long-path aware in its manifest. This is declared for python[w].exe in Python 3.6+.) A device path is an explicit reference to a user's local device directory (in the object namespace), which shadows the global device directory. In NT, this directory is aliased to a special "\\??\\" prefix (backslash only). A local device directory is created for each logon session (not terminal session) by the security system that runs in terminal session 0 (i.e. the system services session). The per-logon directory is located at "\\Sessions\\0\\DosDevices\\<Logon Session ID>". In the Windows API, it's accessible as "//?/" or "//./", or with any mix of forward slashes or backslashes, but only the all-backslash form is special-cased to bypass the normalization step. 
___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3SDFM2EKFO3UNTATS7KVBY2WOUTFMAF5/
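To illustrate the all-backslash "\\?\" form discussed in the message above, here is a minimal sketch of a helper that turns an absolute DOS or UNC path into an extended-length path. The name to_extended_path is hypothetical, and the "\\?\UNC\" mapping for UNC paths is the documented Windows convention rather than something stated in this thread. It uses only the pure string functions in ntpath, so it can be tried on any platform.

```python
import ntpath


def to_extended_path(path):
    r"""Return `path` in the all-backslash "\\?\" form (hypothetical helper).

    Only absolute paths make sense here, because the "\\?\" form skips the
    normalization that would otherwise resolve relative components.
    """
    if path.startswith('\\\\?\\'):
        return path                      # already in the extended form
    if not ntpath.isabs(path):
        raise ValueError('an absolute path is required: %r' % (path,))
    path = ntpath.normpath(path)         # forward slashes become backslashes
    if path.startswith(('\\\\?\\', '\\\\.\\')):
        return path                      # leave other device paths alone
    if path.startswith('\\\\'):
        # UNC path: \\server\share\... -> \\?\UNC\server\share\...
        return '\\\\?\\UNC\\' + path[2:]
    return '\\\\?\\' + path
```

For example, 'C:/Temp/spam' becomes r'\\?\C:\Temp\spam'. Because the result bypasses normalization, every separator in it has to be a backslash, which is why the helper normalizes first.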
[Python-Dev] Re: What to do about invalid escape sequences
On 8/7/19, Steve Dower wrote: > > * change the PyErr_SetExcFromWindowsErrWithFilenameObjects function to > append (or chain) an extra message when either of the filenames contains c > control characters (or change OSError to do it, or the default > sys.excepthook) On a related note for Windows, if the error is specifically ERROR_INVALID_NAME, we could extend this to look for and warn about the five reserved wildcard characters (asterisk, question mark, double quote, less than, greater than), pipe, and colon. It's only sometimes the case for colon because it's allowed in device names and used as the name and type delimiter for stream names. Kernel object names don't reserve wildcard characters, pipe, and colon. So I wouldn't want anything but the control-character warning if it's say ERROR_FILE_NOT_FOUND. An example would be SharedMemory(name='Global\test'), or a similar error for registry key and value names such as OpenKey(hkey, 'spam\test'), that is if winreg were updated to include the name in the exception. Note that forward slash is just a name character in these cases, not a path separator, so we have to use backslash, even if just via replace('/', '\\'). ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UFMVFL4QDUXLZFBWVW4YLAKPHQ6LTPDK/
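The kind of check suggested in the message above could look roughly like the sketch below. It is an illustration only, not how CPython reports errors: the helper name is made up, and colon is deliberately left out of the reserved set because, as noted above, it is valid in drive specifications and stream names.

```python
# Wildcard characters and pipe, which the message above lists as reserved
# in file names. Colon is excluded here because it is only sometimes invalid.
RESERVED_CHARS = frozenset('*?"<>|')


def suspicious_name_chars(name):
    """Return (index, character) pairs for control and reserved characters.

    Control characters (ordinals below 32) usually come from a string
    literal such as 'C:\temp\new.txt' that was meant to be a raw string.
    """
    return [(i, ch) for i, ch in enumerate(name)
            if ord(ch) < 32 or ch in RESERVED_CHARS]
```

A caller could use this after catching an OSError to decide whether an extra hint about escape sequences is worth showing.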
[Python-Dev] Re: What to do about invalid escape sequences
On 8/5/19, Steve Dower wrote: > > though I do also see many people bitten by FileNotFoundError > because of a '\n' in their filename. Thankfully the common filesystems used in Windows reserve ASCII control characters in filenames (except not in stream names or named-pipe names). So a mistaken string literal usually fails with a more obvious ERROR_INVALID_NAME or C EINVAL instead of a mysterious ERROR_FILE_NOT_FOUND or C ENOENT. ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HTOH6MOHYIDD2UX7YSM2ZVY4BP32ATYL/
Re: [Python-Dev] Remove tempfile.mktemp()
On 3/23/19, Cameron Simpson wrote: > > Also, the common examples are attackers who are not the user making the > tempfile, in which case the _default_ mktemp is sort of secure with the > above because it gets made in /tmp which on a modern POSIX system > prevents _other_ uses from removing/renaming a file. (And Eryk I think > described the Windows situation which is similarly protected). Using NamedTemporaryFile(delete=False) or mkstemp() ensures that the file is created and opened securely. In contrast, the filename from mktemp() might be used naively in POSIX, such as open(path, "w"). This file might grant read access to everyone depending on the file-mode creation mask (umask). Also, since it neglects to use exclusive mode ("x"), it might open an existing file that grants read-write permission to the world, or maybe it's a symlink. By default, even naive use of the mktemp() name in Windows remains secure, since every user has a separate temp directory that's only accessible by privileged users such as SYSTEM, Administrators, and Backup Operators (with SeBackupPrivilege and SeRestorePrivilege enabled). The primary issue with a short name is an accidental name collision with another program that's not as careful as Python's tempfile. Using a longer name decreases the chance of this to practically nothing. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove tempfile.mktemp()
On 3/20/19, Greg Ewing wrote: > Antoine Pitrou wrote: > >> How is it more secure than using mktemp()? > > It's not, but it solves the problem someone suggested of another > program not being able to access and/or delete the file. NamedTemporaryFile(delete=False) is more secure than naive use of mktemp(). The file is created exclusively (O_EXCL). Another standard user can't overwrite it. Nor can another standard user delete it if it's created in the default temp directory (e.g. POSIX "/tmp" has the sticky bit set). mkstemp() is similar but lacks the convenience and reliable resource management of a Python file wrapper. There's still the problem of accidental name collisions with other processes that can access the file, i.e. processes running as the same user or, in POSIX, processes running as the super user. I saw a suggestion in this thread to increase the length of the random sequence from 8 characters up to 22 characters in order to make this problem extremely improbable. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
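A minimal sketch of the pattern recommended in the message above: create the file with NamedTemporaryFile(delete=False) so that it is opened exclusively, then hand the name to whatever needs it. The helper name write_temp_securely is made up for illustration, and the caller is assumed to be responsible for removing the file afterwards.

```python
import tempfile


def write_temp_securely(data, suffix=''):
    """Create a temporary file exclusively, write `data`, and return its path.

    NamedTemporaryFile creates the file itself (with O_EXCL), so there is no
    window in which another process could create the name first, unlike
    open(tempfile.mktemp(), 'w').
    """
    with tempfile.NamedTemporaryFile(mode='wb', suffix=suffix,
                                     delete=False) as f:
        f.write(data)
        return f.name
```

If a name has to be chosen by hand instead, opening it with mode 'x' gives the same exclusive-creation guarantee: the open fails with FileExistsError rather than silently reusing an existing file.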
Re: [Python-Dev] Remove tempfile.mktemp()
On 3/20/19, Anders Munch wrote: > > You are right, I must have mentally reversed the polarity of the delete > argument. And I didn't realise that the access right on a file had the > power to prevent itself from being removed from the folder that it's in. I > thought the access flags were a property of the file itself and not the > directory entry. Not sure how that works. In POSIX, it's secure so long as we use a directory that doesn't grant write access to other users, or one that has the sticky bit set such as "/tmp". A directory that has the sticky bit set allows only root and the owner of the file to unlink the file. In Windows, a user's default %TEMP% directory is only accessible by the user, SYSTEM, and Administrators. The only way others can delete a file there is if the file security is modified to allow it (possible for individual files, unlike POSIX). This works even with no access to the temp directory itself because users have SeChangeNotifyPrivilege, which bypasses traverse (execute) access checks. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove tempfile.mktemp()
On 3/19/19, Victor Stinner wrote: > > When I write tests, I don't really care of security, but > NamedTemporaryFile caused me many troubles on Windows: you cannot > delete a file if it's still open in a another program. It's way more > convenient to use tempfile.mktemp(). Opening the file again for normal access is problematic. NamedTemporaryFile opens it with delete access, but Python's open() function doesn't support delete-access sharing unless an opener is used that calls CreateFileW. NamedTemporaryFile does open files with delete-access sharing, so any process can delete the file if it's allowed by the file's security and attributes. You may be thinking of unlinking. In Windows versions prior to 10, that's always a two-step process. A file with its delete disposition set doesn't get unlinked until all references for all open instances are closed. In Windows 10 (release 1709+), we have the option of using SetFileInformationByHandle: FileDispositionInfoEx (21) with FILE_DISPOSITION_FLAG_POSIX_SEMANTICS (2) and FILE_DISPOSITION_FLAG_DELETE (1). The online documentation hasn't been updated to include this, but it's supported in the headers for _WIN32_WINNT_WIN10_RS1 and later. This operation unlinks the file as soon as we close our handle, even if it has existing references. This is explained in the remarks for the underlying NT system call [1]. In particular this resolves the race condition related to handles opened by anti-malware programs. It may be worth adding support for deleting files by handle that tries FileDispositionInfoEx in 1709+. This will work in about half of all Windows systems. (About 40% still run Windows 7.) It's not a panacea for Windows file-deleting woes. We still need to be able to open the file with delete access, which requires existing opens to share delete access. 
[1]: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntddk/ns-ntddk-_file_disposition_information_ex ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Adding test.support.safe_rmpath()
On 2/16/19, Richard Levasseur wrote: > > First: The tempfile module is a poor fit for testing (don't get me wrong, > it works, but its not *nice for use in tests*)*.* This is because: > 1. Using it as a context manager is distracting. The indentation signifies > a conceptual scope the reader needs to be aware of, but in a test context, > its usually not useful. At worst, it covers most of the test. At best, its > constrained to a block at the start. > 2. tempfile defaults to binary mode instead of text; just another thing to > bite you. > 3. On windows, you can't reopen the file, so for cross-platform stuff, you > can't even use it for this case. Python opens files with at least read and write sharing in Windows, so typically there's no problem with opening a file multiple times. The problem is with deleting and renaming open files. Typically delete access is not shared, and, even if it is, a normal delete just sets a disposition. A deleted file is unlinked only after all handles have been closed. Similarly, replacing an open file via os.replace will fail because it can't be unlinked. In Windows 10 we can delete and rename files with POSIX-like semantics. To do this, open a handle with delete access and call SetFileInformationByHandle to set the FileDispositionInfoEx or FileRenameInfoEx information. Thus far this is supported by NTFS, and I think it's only NTFS. It's still not completely like POSIX, since it requires delete-access sharing. But it does provide immediate unlinking, which avoids the race condition when trying to remove a directory that has watched files. Programs that have open files that have been unlinked can continue to access them normally. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] ctypes: is it intentional that id() is the only way to get the address of an object?
On 1/18/19, Steven D'Aprano wrote: > On Thu, Jan 17, 2019 at 07:50:51AM -0600, eryk sun wrote: >> >> It's kind of dangerous to pass an object to C without an increment of >> its reference count. > > "Kind of dangerous?" How dangerous? I take that back. Dangerous is too strong of a word. It can be managed if we're careful to avoid expressions like c_function(id(f())). Using py_object simply avoids that problem. Bear with me while I make a few more comments about py_object, even though it's straying off topic. For a type "O" argument (i.e. py_object is in the function's `argtypes`), we might be able to borrow the reference from the argument tuple. As implemented, however, the argument actually keeps its own reference. For example, we can observe this by calling the from_param method:

    >>> b = bytearray(b'spam')
    >>> arg = ctypes.py_object.from_param(b)
    >>> print(arg)
    >>> print(arg._obj)
    bytearray(b'spam')

This is due to the type "O" setfunc, which needs to keep a reference to the object when setting the value of a py_object instance. The reference is stored as the _objects attribute. (For non-simple pointer and aggregate types, _objects is instead a dict keyed by the index as a hexadecimal string.) (The getfunc and setfunc of a simple ctypes object are called to get and set the value, which also includes cases in which we don't have an actual py_object instance, such as function call arguments; pointer and array indexes; and struct and union fields. These functions are defined in Modules/_ctypes/cfield.c.) IMO, a downside of py_object is that it's a simple type, so the getfunc gets called automatically when getting fields or indexes. This is annoying for py_object since a NULL value raises ValueError. Returning None in this case isn't possible, in contrast to other simple pointer types. We can work around this by subclassing py_object. For example:

    >>> a1 = (ctypes.py_object * 1)()
    >>> a1[0]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: PyObject is NULL

    py_object = type('py_object', (ctypes.py_object,), {})

    >>> a2 = (py_object * 1)()
    >>> a2[0]

Then, like all ctypes pointers, a false boolean value means it's NULL:

    >>> bool(a2[0])
    False
    >>> a2[0] = b'spam'
    >>> bool(a2[0])
    True

py_object doesn't help if a library holds onto the pointer and tries to use it later on. For example, with Python's C API there are functions that 'steal' a reference (with the assumption that it's a newly created object, in which case it's more like 'claiming'), such as PyTuple_SetItem. In this case, we need to increment the reference count via Py_IncRef. py_object can be returned from a callback without leaking a reference, assuming the library manages the new reference. In contrast, other types that need memory support have to leak a reference (e.g. c_wchar_p, i.e. type "Z", needs a capsule object for the wchar_t buffer). In case of a leak, we get warned with RuntimeWarning('memory leak in callback function.'). > If I am reading this correctly, I think you are saying that using id() > in this way is never(?) correct. Yes, it's incorrect, but I've been guilty of using id() like this, too, because it's convenient. Perhaps we could provide a function that's explicitly specified to return the address, if implemented. Maybe call it sys.getaddress()? In my first reply, I provided two alternatives that use ctypes to return the address instead of id(). So there's that as well. The fine print is that ctypes is optional in the standard library. Platforms and implementations don't have to support it. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
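As an illustration of the Py_IncRef point in the message above, the sketch below calls Py_IncRef and Py_DecRef through ctypes.pythonapi with py_object as the declared argument type, and uses sys.getrefcount() to observe the effect. It assumes a standard CPython build; the helper name is made up, and the observed counts are an implementation detail that can differ for immortal objects or on other builds.

```python
import ctypes
import sys

# Declare the prototypes so that ctypes passes the object pointer itself
# (type "O") instead of trying to convert the argument to a C int.
ctypes.pythonapi.Py_IncRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_IncRef.restype = None
ctypes.pythonapi.Py_DecRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_DecRef.restype = None


def refcount_delta_after_incref(obj):
    """Return (delta after Py_IncRef, delta after the matching Py_DecRef)."""
    before = sys.getrefcount(obj)
    ctypes.pythonapi.Py_IncRef(obj)      # the extra reference a library might 'steal'
    after = sys.getrefcount(obj)
    ctypes.pythonapi.Py_DecRef(obj)      # give the reference back
    restored = sys.getrefcount(obj)
    return after - before, restored - before
```

In real use the Py_DecRef call would be omitted, because the function that steals the reference becomes responsible for releasing it.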
Re: [Python-Dev] ctypes: is it intentional that id() is the only way to get the address of an object?
On 1/17/19, Steven D'Aprano wrote: > > I understand that the only way to pass the address of an object to > ctypes is to use that id. Is that intentional? It's kind of dangerous to pass an object to C without an increment of its reference count. The proper way is to use a simple pointer of type "O" (object), which is already created for you as the "py_object" type.

    >>> ctypes.py_object._type_
    'O'
    >>> ctypes.py_object.__bases__
    (<class '_ctypes._SimpleCData'>,)

It keeps a reference in the readonly _objects attribute. For example:

    >>> b = bytearray(b'spam')
    >>> sys.getrefcount(b)
    2
    >>> cb = ctypes.py_object(b)
    >>> sys.getrefcount(b)
    3
    >>> cb._objects
    bytearray(b'spam')
    >>> del cb
    >>> sys.getrefcount(b)
    2

If you need the address without relying on id(), cast to a void pointer:

    >>> ctypes.POINTER(ctypes.c_void_p)(cb)[0] == id(b)
    True

Or instantiate a c_void_p from the py_object as a buffer:

    >>> ctypes.c_void_p.from_buffer(cb).value == id(b)
    True

Note that ctypes.cast() doesn't work in this case. It's implemented as an FFI function that takes the object address as a void pointer. The from_param method of c_void_p doesn't support py_object:

    >>> ctypes.c_void_p.from_param(cb)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: wrong type

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Questions about signal handling.
On Fri, Sep 21, 2018 at 6:10 PM, Victor Stinner wrote: > > Moreover, you can get the signal while you don't hold the GIL :-) Note that, in Windows, SIGINT and SIGBREAK are implemented in the C runtime and linked to the corresponding console control events in a console application, such as python.exe. Console control events are delivered on a new thread (i.e. no Python thread state) that starts at CtrlRoutine in kernelbase.dll. The session server (csrss.exe) creates this thread remotely upon request from the console host process (conhost.exe). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Tests failing on Windows with TESTFN
On Sun, Jul 29, 2018 at 2:21 PM, Jeremy Kloth wrote: > > try: > os.rename(new_file.name, self._path) > except FileExistsError: > -os.remove(self._path) > +temp_name = _create_temporary_name(self._path) > +os.rename(self._path, temp_name) > os.rename(new_file.name, self._path) > +os.remove(temp_name) This should call os.replace to allow the file system to replace the existing file. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
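A minimal sketch of the os.replace approach suggested in the message above: write the new content to a temporary file in the same directory and then let the file system replace the target in one call. The helper name replace_file_contents is made up for illustration. As other messages in this thread note, on Windows the replace can still fail while another handle has the target open without delete sharing.

```python
import os
import tempfile


def replace_file_contents(path, data):
    """Write `data` to a sibling temporary file, then os.replace() it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Creating the temporary file in the same directory keeps the rename
    # on one volume, which os.replace() requires.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)
        os.replace(tmp_path, path)
    except BaseException:
        # Don't leave the temporary file behind if anything failed.
        try:
            os.remove(tmp_path)
        except OSError:
            pass
        raise
```

Unlike the rename/remove sequence in the quoted patch, this never leaves a moment in which the target name is missing, provided the replace itself succeeds.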
Re: [Python-Dev] Tests failing on Windows with TESTFN
On Sun, Jul 29, 2018 at 12:35 PM, Steve Dower wrote: > > One additional thing that may help (if support.unlink doesn't already do it) > is to rename the file before deleting it. Renames are always possible even > with open handles, and then you can create a new file at the original name. Renaming open files typically fails with a sharing violation (32). Most programs open files with read and write sharing but not delete sharing. This applies to Python, except temporary files (i.e. os.O_TEMPORARY) do share delete access. Renaming a file is effectively adding a new link and deleting the old link, so it requires opening the file with delete access. Also, renaming a directory that has open files in the tree fails with access denied (5). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Tests failing on Windows with TESTFN
On Sun, Jul 29, 2018 at 9:13 AM, Tim Golden wrote:
>
> For an example:
>
> http://tjg.org.uk/test.log
>
> Thinkpad T420, 4Gb, i5, SSD
>
> Recently rebuilt and reinstalled: Win10, VS2017, TortoiseGit, standard
> Windows Antimalware, usual developer tools. That particular run was done
> with the laptop unattended (ie nothing else going on at the front end).
> But the problem is certainly not specific to this laptop.

On my last run I had one test directory that wasn't removed properly, but nothing like the flood of EACCES and ERROR_ACCESS_DENIED errors you have in that log. Then again, I had Defender disabled by policy. I'll enable it and add exceptions for my source and build directories, and see how it goes.

It would be nice if OSError instances always captured the last Windows error and NT status values when instantiated. We have no guarantees that these values are valid, but in many contexts they are. In the case of a test log, it would certainly help to clarify errors without having to individually investigate each one.

For example, trying to open a directory as a file is a common error, but all Python tells us on Windows is that it failed with EACCES. In this case the last Windows error is ERROR_ACCESS_DENIED, which doesn't help, but the last NT status code is STATUS_FILE_IS_A_DIRECTORY (0xc00000ba).

Here's a file opener that adds last_winerror and last_ntstatus values.

    import os
    import ctypes

    ntdll = ctypes.WinDLL('ntdll')
    kernel32 = ctypes.WinDLL('kernel32')

    def nt_opener(path, flags):
        try:
            return os.open(path, flags)
        except OSError as e:
            last_ntstatus = ntdll.RtlGetLastNtStatus()
            last_winerror = kernel32.GetLastError()
            # NTSTATUS comes back as a signed int; keep the unsigned 32-bit value
            e.last_ntstatus = last_ntstatus & 2**32 - 1
            e.last_winerror = (last_winerror if e.winerror is None
                               else e.winerror)
            if e.errno is not None or e.winerror is not None:
                # hack the last error/status into the error message
                e.strerror = '[Last NtStatus {:#08x}] {}'.format(
                                    e.last_ntstatus, e.strerror or '')
                if e.winerror is None:
                    e.strerror = '[Last WinError {}] {}'.format(
                                    e.last_winerror, e.strerror or '')
            raise e from None

Opening a directory as a file:

    >>> open('C:/Windows', opener=nt_opener)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 17, in nt_opener
      File "<stdin>", line 3, in nt_opener
    PermissionError: [Errno 13] [Last WinError 5] [Last NtStatus 0xc00000ba] Permission denied: 'C:/Windows'
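For comparison, here is a small portable sketch of how the same mistake (opening a directory as a file) surfaces through the regular OSError attributes. The helper name describe_open_failure is hypothetical. On POSIX I'd expect IsADirectoryError; on Windows, per the message above, only a PermissionError with EACCES.

```python
import tempfile

def describe_open_failure(path):
    """Try to open `path` as a regular file; report how the failure surfaces."""
    try:
        with open(path):
            return None  # the open unexpectedly succeeded
    except OSError as e:
        # winerror only exists on Windows builds, so fetch it defensively
        return type(e).__name__, e.errno, getattr(e, 'winerror', None)

with tempfile.TemporaryDirectory() as directory:
    result = describe_open_failure(directory)

print(result)
```

Nothing in this tuple distinguishes "is a directory" from other access-denied causes on Windows, which is the gap the opener above tries to fill.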
Re: [Python-Dev] Tests failing on Windows with TESTFN
On Sat, Jul 28, 2018 at 9:17 PM, Jeremy Kloth wrote: > > *PLEASE*, don't use tempfile to create files/directories in tests. It > is unfriendly to (Windows) buildbots. The current approach of > directory-per-process ensures no test turds are left behind, whereas > the tempfile solution slowly fills up my buildbot. Windows doesn't > natively clean out the temp directory. FYI, Windows 10 storage sense (under system->storage) can be configured to delete temporary files on a schedule. Of course that doesn't help with older systems. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Tests failing on Windows with TESTFN
On Sat, Jul 28, 2018 at 5:20 PM, Tim Golden wrote:
>
> I've got a mixture of Permission (winerror 13) & Access errors (winerror 5)

EACCES (13) is a CRT errno value. Python raises PermissionError for EACCES and EPERM (1, not used). It also does the reverse mapping for WinAPI calls, so PermissionError is raised either way.

About 25 WinAPI error codes map to EACCES. Commonly it's due to either ERROR_ACCESS_DENIED (5) or ERROR_SHARING_VIOLATION (32). open() uses read-write sharing but not delete sharing. In this case, trying to delete an already open file and trying to open a file that's already open with delete access (e.g. an O_TEMPORARY open) both fail with a sharing violation.

An access-denied error could be due to a range of causes. Over 20 NTAPI status codes map to ERROR_ACCESS_DENIED. Commonly for a file it's due to one of the following status codes:

    STATUS_ACCESS_DENIED (0xc0000022)
        The file security doesn't grant the requested access to the caller.

    STATUS_DELETE_PENDING (0xc0000056)
        The file's delete disposition is set, i.e. it's flagged to be
        deleted when the last handle is closed. Opening a new handle is
        disallowed for any access.

    STATUS_FILE_IS_A_DIRECTORY (0xc00000ba)
        Except when using backup semantics, CreateFile calls NtCreateFile
        with the flag FILE_NON_DIRECTORY_FILE, so only non-directory
        files/devices can be opened.

    STATUS_CANNOT_DELETE (0xc0000121)
        The file is either readonly or memory mapped.
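A minimal sketch of the two mappings described above. The errno-to-exception part is standard OSError behavior on every platform. The lookup table only covers the four status codes named in this message, written as full 32-bit values, and the helper describe_ntstatus is a hypothetical name for illustration.

```python
import errno

# Full 32-bit NTSTATUS values for the four status codes named above.
NT_STATUS_NAMES = {
    0xC0000022: 'STATUS_ACCESS_DENIED',
    0xC0000056: 'STATUS_DELETE_PENDING',
    0xC00000BA: 'STATUS_FILE_IS_A_DIRECTORY',
    0xC0000121: 'STATUS_CANNOT_DELETE',
}

def describe_ntstatus(value):
    """Name a status code; accepts the signed int that ctypes would return."""
    value &= 0xFFFFFFFF
    return NT_STATUS_NAMES.get(value, 'unknown status {:#010x}'.format(value))

# OSError picks its subclass from errno: both EACCES and EPERM
# produce PermissionError.
eacces_error = OSError(errno.EACCES, 'Permission denied')
eperm_error = OSError(errno.EPERM, 'Operation not permitted')

print(type(eacces_error).__name__, type(eperm_error).__name__)
print(describe_ntstatus(-1073741638))
```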
Re: [Python-Dev] [Windows] how to prevent the wrong version of zlib1.dll to be used by lib-dynload modules
On Mon, Jul 23, 2018 at 2:31 PM, Eric Le Lay wrote: > > I encountered a problem with the Windows packaging of gPodder[1] > using msys2: Are you using regular Windows Python with msys2, or their custom port? I installed msys2 and used pacman to install Python 3.6. The msys2 environment names libraries with an "msys-" prefix in the "/usr/bin" directory, such as msys-python3.6m.dll, msys-readline7.dll, and msys-z.dll (zlib). This is also the application directory of the msys2 build of Python (i.e. "/usr/bin/python.exe"), so it's the first directory in the default DLL search path (ahead of system directories and PATH). Unlike Windows Python, msys2 Python does not use the alternate search path that replaces the application directory with the DLL directory in the search path. A way to implement this that allows multiple versions of a DLL to be loaded in the same process is to use an assembly that includes the DLL file in its ".manifest" file. Add the assembly to the extension module's #2 manifest (typically embedded, but can be ".2"). The system looks for the "" subdirectory in the module directory. In Windows 7+ you can also add a probing path in a config file (i.e. ".config") [1] that extends the SxS search path with up to 9 relative paths, which can be up to two levels above the module directory (i.e. "..\.."). [1]: https://docs.microsoft.com/en-us/windows/desktop/SbsCs/application-configuration-files ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] subprocess not escaping "^" on Windows
On Mon, Jan 8, 2018 at 9:26 PM, Steve Dower <steve.do...@python.org> wrote:
> On 09Jan2018 0744, eryk sun wrote:
>>
>> It's common to discourage using `shell=True` because it's considered
>> insecure. One of the reasons to use CMD in Windows is that it tries
>> ShellExecuteEx if CreateProcess fails. ShellExecuteEx supports "App
>> Paths" commands, file actions (open, edit, print), UAC elevation (via
>> "runas" or if requested by the manifest), protocols (including
>> "shell:"), and opening folders in Explorer. It isn't a scripting
>> language, however, so it doesn't pose the same risk as using CMD.
>> Calling ShellExecuteEx could be integrated in subprocess as a new
>> Popen parameter, such as `winshell` or `shellex`.
>
> This can also be used directly as os.startfile, the only downside being that
> you can't wait for the process to complete (but that's due to the underlying
> API, which may not end up starting a process but rather sending a message to
> an existing long-running one such as explorer.exe). I'd certainly recommend
> it for actions like "open this file with its default editor" or "browse to
> this web page with the default browser".

Yes, I forgot to mention that os.startfile can work sometimes. But often one needs to pass command-line parameters. Also, os.startfile can't set a different working directory, the nShow SW_* window state, or flags such as SEE_MASK_NO_CONSOLE (prevent allocating a new console).

Rather than extend os.startfile, it seems more useful in general to wrap ShellExecuteEx in _winapi and extend subprocess. Then os.startfile can be reimplemented in terms of subprocess.Popen, like os.popen.
Re: [Python-Dev] subprocess not escaping "^" on Windows
On Sun, Jan 7, 2018 at 6:48 PM, Christian Tismer wrote:
> That is true.
> list2cmdline escapes partially, but on NT and Windows10, the "^" must
> also be escaped, but is not. The "|" pipe symbol must also be escaped
> by "^", as many others as well.
>
> The effect was that passing a regexp as parameter to a windows program
> gave me strange effects, and I recognized that "^" was missing.
>
> So I was asking for a coherent solution:
> Escape things completely or omit "shell=True".
>
> Yes, there is a list of chars to escape, and it is Windows version
> dependent. I can provide it if it makes sense.

subprocess.list2cmdline is meant to help support cross-platform code, since Windows uses a command-line instead of an argv array. The command-line parsing rules used by VC++ (and CommandLineToArgvW) are the most common in practice. list2cmdline is intended for this set of applications. Otherwise pass args as a string instead of a list.

In CMD we can quote part of a command line in double quotes to escape special characters. The quotes are preserved in the application command line. This can get complicated when we need to preserve literal quotes in the command line of an application that uses VC++ backslash escaping. CMD doesn't recognize backslash as an escape character, which gives rise to a quoting conflict between CMD and the application. Some applications support translating single quotes to double quotes in this case (e.g. schtasks.exe). Single quotes generally aren't used in CMD, except in a `for /f` loop, but this can be forced to use backquotes instead via `usebackq`.

Quoting doesn't escape the percent character that's used for environment variables. In batch scripts percent can be escaped by doubling it, but not in /c commands. Some applications can translate a substitute character in this case, such as "~" (e.g. setx.exe). Otherwise, we can usually disrupt matching an existing variable by adding a "^" character after the first percent character. The "^" escape character gets consumed later on in parsing -- as long as it's not quoted (see the previous paragraph for complications). Nonetheless, "^" is a valid name character, so there's still a possibility of matching an environment variable (perhaps a malicious one). For example:

    C:\>python -c "print('"%^"time%')"
    %time%

    C:\>set "^"time=spam"

    C:\>python -c "print('"%^"time%')"
    spam

Anyway, we're supposed to pass args as a string when using the shell in POSIX, so we may as well stay consistent with this in Windows. Practically no one wants the resulting behavior when passing a shell command as a list in POSIX. For example:

    >>> subprocess.call(['echo \\$0=$0 \\$1=$1', 'spam', 'eggs'], shell=True)
    $0=spam $1=eggs

It's common to discourage using `shell=True` because it's considered insecure. One of the reasons to use CMD in Windows is that it tries ShellExecuteEx if CreateProcess fails. ShellExecuteEx supports "App Paths" commands, file actions (open, edit, print), UAC elevation (via "runas" or if requested by the manifest), protocols (including "shell:"), and opening folders in Explorer. It isn't a scripting language, however, so it doesn't pose the same risk as using CMD. Calling ShellExecuteEx could be integrated in subprocess as a new Popen parameter, such as `winshell` or `shellex`.
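To make the scope of list2cmdline concrete, here is a short sketch. It shows that the function only applies the VC++ argv rules (double quotes and backslash escaping) and leaves CMD metacharacters such as "^" and "|" alone. The argument values are made up for illustration.

```python
import subprocess

# Arguments with CMD metacharacters and an embedded space.
args = ['findstr', '^spam|eggs', 'my file.txt']
cmdline = subprocess.list2cmdline(args)

# An argument with embedded double quotes gets backslash escaping,
# which the VC++ runtime understands but CMD does not.
quoted = subprocess.list2cmdline(['say "hi"'])

print(cmdline)
print(quoted)
```

The "^" and "|" pass through untouched, so the result is only safe to hand to CreateProcess directly, not to CMD via shell=True.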
Re: [Python-Dev] ctypes, memory mapped files and context manager
On Sun, Jan 8, 2017 at 8:25 AM, Armin Rigo wrote:
>
> c_raw = ctypes.PYFUNCTYPE(ctypes.c_void_p, ctypes.c_void_p)(lambda p: p)

Use ctypes.addressof.

> addr = c_raw(ctypes.pointer(T.from_buffer(m)))
> b = ctypes.cast(addr, ctypes.POINTER(T)).contents

ctypes.cast uses an FFI call. In this case you can more simply use from_address:

    b = T.from_address(ctypes.addressof(T.from_buffer(m)))

There's no supporting connection between b and m. If m was allocated from a heap/pool/freelist, as opposed to a separate mmap (VirtualAlloc) call, then you won't necessarily get a segfault (access violation) if b is used after m has been deallocated or internally realloc'd. It can lead to corrupt data and difficult to diagnose errors. You're lucky if it segfaults.
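A small sketch of the from_address approach, using a bytearray in place of the mmap from the thread. The structure T and its single field are assumptions for illustration. It keeps the from_buffer object alive in a separate variable so the memory stays valid while b is in use.

```python
import ctypes

class T(ctypes.Structure):
    # Hypothetical layout; the thread doesn't show the real fields of T.
    _fields_ = [('x', ctypes.c_uint32)]

m = bytearray(ctypes.sizeof(T))

a = T.from_buffer(m)                     # holds a reference to m's buffer
b = T.from_address(ctypes.addressof(a))  # same memory, but no reference

b.x = 0x11223344
print(hex(a.x))       # both objects view the same bytes
print(b._objects)     # b keeps nothing alive
```

Because b holds no reference, it's up to the caller to keep a (and therefore m) alive for as long as b is used.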
Re: [Python-Dev] ctypes, memory mapped files and context manager
On Thu, Jan 5, 2017 at 11:28 PM, Hans-Peter Jansen wrote:
> Leaves the question, how stable this "interface" is?
> Accessing _objects here belongs to voodoo programming practices of course, but
> the magic is locally limited to just two lines of code, which is acceptable in
> order to get this context manager working without messing with the rest of the
> code.

My intent was not to suggest that anyone directly use the _objects value / dict in production code. It's a private implementation detail. I was demonstrating the problem of simply releasing the buffer and the large number of checks that would be required if b_ptr is cleared. It would be simpler for a release() method to allocate new memory for the object and set the b_needsfree flag, but this may hide bugs. Operating on a released object should raise an exception.
Re: [Python-Dev] ctypes, memory mapped files and context manager
On Thu, Jan 5, 2017 at 2:37 AM, Nick Coghlan wrote:
> On 5 January 2017 at 10:28, Hans-Peter Jansen wrote:
>> In order to get this working properly, the ctypes mapping needs a method to
>> free the mapping actively. E.g.:
>>
>> @contextmanager
>> def map_struct(m, n):
>>     m.resize(n * mmap.PAGESIZE)
>>     yield T.from_buffer(m)
>>     T.unmap_buffer(m)
>>
>> Other attempts with weakref and the like do not work due to the nature of the
>> ctypes types.
>
> I don't know ctypes well enough myself to comment on the idea of
> offering fully deterministic cleanup, but the closest you could get to
> that without requiring a change to ctypes is to have the context
> manager introduce a layer of indirection:

I think that's the best you can do with the current state of ctypes.

from_buffer was made safer in Python 3 by ensuring it keeps a memoryview reference in the _objects attribute (i.e. CDataObject.b_objects). Hans-Peter's problem is a consequence of this reference. Simply calling release() on the underlying memoryview is unsafe. For example:

    >>> b = bytearray(2**20)
    >>> a = ctypes.c_char.from_buffer(b)
    >>> a._objects
    >>> a._objects.release()
    >>> del b
    >>> a.value
    Segmentation fault (core dumped)

A release() method on ctypes objects could release the memoryview and also clear the CDataObject b_ptr field. In this case, any function that accesses b_ptr would have to be modified to raise a ValueError for a NULL value. Currently ctypes assumes b_ptr is valid, so this would require adding a lot of checks.

On a related note, ctypes objects aren't tracking the number of exported views like they should. resize() should raise a BufferError in the following example:

    >>> b = (ctypes.c_char * (2**20))(255)
    >>> m = memoryview(b).cast('B')
    >>> m[0]
    255
    >>> ctypes.resize(b, 2**22)
    >>> m[0]
    Segmentation fault (core dumped)
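For contrast, a sketch of the export tracking that bytearray does and that the message above says ctypes objects lack. It relies on CPython's reference counting to drop the from_buffer object immediately on del; other implementations may need an explicit garbage collection step.

```python
import ctypes

buf = bytearray(16)
view = ctypes.c_char.from_buffer(buf)  # exports buf's buffer via a memoryview

try:
    buf.extend(b'x')                   # resizing while exported is refused
    resize_blocked = False
except BufferError:
    resize_blocked = True

del view                               # CPython frees the ctypes object here,
buf.extend(b'x')                       # so the resize is allowed again

print(resize_blocked, len(buf))
```

This is the safe behavior: the owner refuses to move its memory while a view exists, instead of leaving the view pointing at freed memory.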
Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8
On Mon, Sep 5, 2016 at 9:45 PM, Steve Dower wrote:
>
> So it works, though the behaviour is a little strange when you do it from
> the interactive prompt:
>
> >>> sys.stdin.buffer.raw.read(1)
> ɒprint('hi')
> b'\xc9'
> hi
> >>> sys.stdin.buffer.raw.read(1)
> b'\x92'
>
> What happens here is the raw.read(1) rounds one byte up to one character,
> reads the turned alpha, returns a single byte of the two byte encoded form
> and caches the second byte. Then interactive mode reads from stdin and gets
> the rest of the characters, starting from the print() and executes that.
> Finally the next call to raw.read(1) returns the cached second byte of the
> turned alpha.
>
> This is basically only a problem because the readline implementation is
> totally separate from the stdin object and doesn't know about the small
> cache (and for now, I think it's going to stay that way - merging readline
> and stdin would be great, but is a fairly significant task that won't make
> 3.6 at this stage).

It needs to read a minimum of 2 codes in case the first character is a lead surrogate. It can use a length 2 WCHAR buffer and remember how many bytes have been written (for the general case -- not specifically for this case).

Example failure using your 3rd patch:

    >>> _ = write_console_input("\U00010000print('hi')\r\n");\
    ... raw_read(1)
    print('hi')
    b'\xef'
    >>>   File "<stdin>", line 1
        �print('hi')
        ^
    SyntaxError: invalid character in identifier
    >>> raw_read(1)
    b'\xbf'
    >>> raw_read(1)
    b'\xbd'

The raw read captures the first surrogate code, and transcodes it as the replacement character b'\xef\xbf\xbd' (U+FFFD). Then PyOS_Readline captures the 2nd surrogate and decodes it as the replacement character.

In the general case in which a lead surrogate is the last code read, but not at index 0, it can use the internal buffer to save the code for the next call.

Surrogates that aren't in valid pairs should be allowed to pass through via surrogatepass. This aims for consistency with the filesystem encoding PEP.
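The difference between replacing and passing through a split surrogate can be sketched with Python's own codecs, independent of the console. This only illustrates the error handlers; it is not the code path used by the patch under discussion.

```python
# U+10000 is the surrogate pair D800 DC00 in UTF-16.
units = '\U00010000'.encode('utf-16-le')
lead_unit, trail_unit = units[:2], units[2:]

# Decoding each half on its own with "replace" loses the character...
replaced = lead_unit.decode('utf-16-le', 'replace')
replaced_utf8 = replaced.encode('utf-8')

# ...while "surrogatepass" keeps the lone code so it can be rejoined later.
lead = lead_unit.decode('utf-16-le', 'surrogatepass')
trail = trail_unit.decode('utf-16-le', 'surrogatepass')
rejoined = (lead + trail).encode('utf-16-le', 'surrogatepass').decode('utf-16-le')

print(replaced_utf8)                               # the U+FFFD bytes seen above
print(lead.encode('utf-8', 'surrogatepass'))       # WTF-8 style lone surrogate
print(rejoined == '\U00010000')
```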
Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8
On Mon, Sep 5, 2016 at 7:54 PM, Steve Dower <steve.do...@python.org> wrote:
> On 05Sep2016 1234, eryk sun wrote:
>>
>> Also, the console is UCS-2, which can't be transcoded between UTF-16
>> and UTF-8. Supporting UCS-2 in the console would integrate nicely with
>> the filesystem PEP. It makes it always possible to print
>> os.listdir('.'), copy and paste, and read it back without data loss.
>
> Supporting UTF-8 actually works better for this. We already use
> surrogatepass explicitly (on the filesystem side, with PEP 529) and
> implicitly (on the console side, using the Windows conversion API).

CP_UTF8 requires valid UTF-16 text. MultiByteToWideChar and WideCharToMultiByte are of no practical use here. For example:

    >>> raw_read = sys.stdin.buffer.raw.read
    >>> _ = write_console_input('\ud800\ud800\r\n'); raw_read(16)
    ��
    b'\xef\xbf\xbd\xef\xbf\xbd\r\n'

This requires Python's "surrogatepass" error handler. It's also required to decode UTF-8 that's potentially WTF-8 from os.listdir(b'.'). Coming from the wild, there's a chance that arbitrary bytes have invalid sequences other than lone surrogates, so it needs to fall back on "replace" to deal with errors that "surrogatepass" doesn't handle.

> Writing a partial character is easily avoidable by the user. We can either
> fail with an error or print garbage, and currently printing garbage is the
> most compatible behaviour. (Also occurs on Linux - I have a VM running this
> week for testing this stuff.)

Are you sure about that? The internal screen buffer of a Linux terminal is bytes; it doesn't transcode to a wide-character format. In the Unix world, almost everything is "get a byte, get a byte, get a byte, byte, byte". Here's what I see in Ubuntu using GNOME Terminal, for example:

    >>> raw_write = sys.stdout.buffer.raw.write
    >>> b = 'αβψδε\n'.encode()
    >>> b
    b'\xce\xb1\xce\xb2\xcf\x88\xce\xb4\xce\xb5\n'
    >>> for c in b: _ = raw_write(bytes([c]))
    ...
    αβψδε

Here it is on Windows with your patch:

    >>> raw_write = sys.stdout.buffer.raw.write
    >>> b = 'αβψδε\n'.encode()
    >>> b
    b'\xce\xb1\xce\xb2\xcf\x88\xce\xb4\xce\xb5\n'
    >>> for c in b: _ = raw_write(bytes([c]))
    ...
    ��

For the write case this can be addressed by identifying an incomplete sequence at the tail end and either buffering it as 'written' or rejecting it for the user/buffer to try again with the complete sequence. I think rejection isn't a good option when the incomplete sequence starts at index 0. That should be buffered. I prefer buffering in all cases.

>> It would probably be simpler to use UTF-16 in the main pipeline and
>> implement Martin's suggestion to mix in a UTF-8 buffer. The UTF-16
>> buffer could be renamed as "wbuffer", for expert use. However, if
>> you're fully committed to transcoding in the raw layer, I'm certain
>> that these problems can be addressed with small buffers and using
>> Python's codec machinery for a flexible mix of "surrogatepass" and
>> "replace" error handling.
>
> I don't think it actually makes things simpler. Having two buffers is
> generally a bad idea unless they are perfectly synced, which would be
> impossible here without data corruption (if you read half a utf-8 character
> sequence and then read the wide buffer, do you get that character or not?).

Martin's idea, as I understand it, is a UTF-8 buffer that reads from and writes to the text wrapper. It necessarily consumes at least one character and buffers it to allow reading per byte. Likewise for writing, it buffers bytes until it can write a character to the text wrapper. ISTM, it has to look for incomplete lead-continuation byte sequences at the tail end, to hold them until the sequence is complete, at which time it either decodes to a valid character or the U+FFFD replacement character.

Also, I found that read(n) has to read a character at a time. That's the only way to emulate line-input mode to detect "\n" and stop reading.
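The tail-end buffering described above can be sketched with Python's incremental UTF-8 decoder, which already holds back an incomplete sequence until it completes. This is only an illustration of the buffering idea using the standard codecs module; the helper name feed_bytewise is hypothetical and it is not the console implementation.

```python
import codecs

def feed_bytewise(data, errors='replace'):
    """Decode `data` one byte at a time, as in the raw_write loop above."""
    decoder = codecs.getincrementaldecoder('utf-8')(errors)
    chunks = [decoder.decode(bytes([byte])) for byte in data]
    # final=True flushes anything still buffered; a dangling lead byte
    # then becomes U+FFFD under the "replace" handler.
    chunks.append(decoder.decode(b'', final=True))
    return chunks

complete = feed_bytewise('αβ\n'.encode())
truncated = feed_bytewise(b'\xce')

print(complete)
print(truncated)
```

Each lead byte yields an empty string and the character only appears once its continuation byte arrives, so nothing is written as garbage in between.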
Technically this is implemented in a RawIOBase, which dictates that operations should use a single system call, but since it's interfacing with a text wrapper around a buffer around the actual UCS-2 raw console stream, any notion of a 'system call' would be a sham. Because of the UTF-8 buffering there is a synchronization issue, but it has character granularity. For example, when decoding UTF-8, you don't get half of a surrogate pair. You decode the full character, and write that as a discrete unit to the text wrapper. I'd have to experiment to see how bad this can get. If it's too confusing the idea isn't practical. On the plus side, when working with text it's all native UCS-2 up to the TextIOWrapper, so it's as
Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8
I have some suggestions. With ReadConsoleW, CPython can use the pInputControl parameter to set a CtrlWakeup mask. This enables a Unix-style Ctrl+D for ending a read without having to press enter. For example:

    >>> CTRL_MASK = 1 << 4
    >>> inctrl = (ctypes.c_ulong * 4)(16, 0, CTRL_MASK, 0)
    >>> _ = kernel32.ReadConsoleW(hStdIn, buf, 100, pn, inctrl); print()
    spam
    >>> buf.value
    'spam\x04'
    >>> pn[0]
    5

read() would have to manually replace '\x04' with NUL. Ctrl+Z can also be added to the mask:

    >>> CTRL_MASK = 1 << 4 | 1 << 26
    >>> inctrl = (ctypes.c_ulong * 4)(16, 0, CTRL_MASK, 0)
    >>> _ = kernel32.ReadConsoleW(hStdIn, buf, 100, pn, inctrl); print()
    spam
    >>> buf.value
    'spam\x1a'

I'd like a method to query, set and unset ENABLE_VIRTUAL_TERMINAL_PROCESSING mode for the screen buffer (sys.stdout and sys.stderr) without having to use ctypes. The console in Windows 10 has built-in VT100 emulation, but it's initially disabled. The cmd shell enables it, but Python scripts aren't always run from cmd.exe. Sometimes they're run in a new console from Explorer or via "start", etc. For example, IPython could check for this to provide more bells and whistles when PyReadline isn't installed.

Finally, functions such as WriteConsoleInputW and ReadConsoleOutputCharacter require opening CONIN$ or CONOUT$ with GENERIC_READ | GENERIC_WRITE access. The initial handles given to a console process have read-write access. For opening a new handle by device name, WindowsConsoleIO should first try GENERIC_READ | GENERIC_WRITE -- with a fallback to either GENERIC_READ or GENERIC_WRITE. The fallback is necessary for CON, which uses the desired access to determine whether to open the input buffer or screen buffer.

---

Paul, do you have example code that uses the 'raw' stream? Using the buffer should behave as it always has -- at least in this regard. sys.stdin.buffer requests a large block, such as 8 KB. But since the console defaults to a cooked mode (i.e. processed input and line input -- control keys, command-line editing, input history, and aliases), ReadConsole returns when enter is pressed or when interrupted. It returns at least '\r\n', unless interrupted by Ctrl+C, Ctrl+Break or a custom CtrlWakeup key. However, if line-input mode is disabled, ReadConsole returns as soon as one or more characters is available in the input buffer.

As to kbhit() returning true, this does not mean that read(1) from console input won't block (not unless line-input mode is disabled). It does mean that getwch() won't block (note the "w" in there; this one reads Unicode characters). The CRT's conio functions (e.g. kbhit, getwch) put the console input buffer in a raw mode (e.g. ^C is read as '\x03' instead of generating a CTRL_C_EVENT) and call the lower-level functions PeekConsoleInputW (kbhit) and ReadConsoleInputW (getwch), to peek at and read input event records.

---

Splitting surrogate pairs across reads is a problem. Granted, this should rarely be an issue given the size of the reads that the buffer requests and the typical line length. In most cases the buffer completely consumes the entire line in one read. But in principle the raw stream shouldn't replace split surrogates with the U+FFFD replacement character. For example, with Steve's patch from issue 1602:

    >>> _ = write_console_input('\U00010000\r\n');\
    ... b1 = raw_read(4); b2 = raw_read(4); b3 = raw_read(8)
    >>> b1, b2
    (b'\xef\xbf\xbd', b'\xef\xbf\xbd')

Splitting UTF-8 sequences across writes is more common. Currently a raw write doesn't handle this correctly:

    >>> b = 'eggs \U00010000 spam\n'.encode('utf-8')
    >>> _ = raw_write(b[:6]); _ = raw_write(b[6:])
    eggs spam

Also, the console is UCS-2, which can't be transcoded between UTF-16 and UTF-8. Supporting UCS-2 in the console would integrate nicely with the filesystem PEP. It makes it always possible to print os.listdir('.'), copy and paste, and read it back without data loss.

It would probably be simpler to use UTF-16 in the main pipeline and implement Martin's suggestion to mix in a UTF-8 buffer. The UTF-16 buffer could be renamed as "wbuffer", for expert use. However, if you're fully committed to transcoding in the raw layer, I'm certain that these problems can be addressed with small buffers and using Python's codec machinery for a flexible mix of "surrogatepass" and "replace" error handling.
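As a sketch of how the CtrlWakeup mask in the first suggestion is built: each control character maps to the bit at its code value, so Ctrl+D (0x04) is bit 4 and Ctrl+Z (0x1A) is bit 26. The structure below mirrors the four-ULONG layout used in the session above, with c_uint32 substituted so the size is the same on every platform; the helper name ctrl_wakeup_mask is hypothetical. Nothing here calls ReadConsoleW.

```python
import ctypes

class CONSOLE_READCONSOLE_CONTROL(ctypes.Structure):
    # Same field order as the (16, 0, CTRL_MASK, 0) array above.
    _fields_ = [('nLength', ctypes.c_uint32),
                ('nInitialChars', ctypes.c_uint32),
                ('dwCtrlWakeupMask', ctypes.c_uint32),
                ('dwControlKeyState', ctypes.c_uint32)]

def ctrl_wakeup_mask(*letters):
    """Build a wakeup mask, e.g. ctrl_wakeup_mask('D', 'Z')."""
    mask = 0
    for letter in letters:
        # Ctrl+<letter> produces the control code ord(letter) - ord('@').
        mask |= 1 << (ord(letter.upper()) - ord('@'))
    return mask

inctrl = CONSOLE_READCONSOLE_CONTROL(
    ctypes.sizeof(CONSOLE_READCONSOLE_CONTROL), 0,
    ctrl_wakeup_mask('D', 'Z'), 0)

print(inctrl.nLength, hex(inctrl.dwCtrlWakeupMask))
```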
Re: [Python-Dev] File system path encoding on Windows
On Mon, Aug 22, 2016 at 3:58 PM, Steve Dower wrote:
> All MSVC users have been pushed towards Unicode for many years. The .NET
> Framework has defaulted to UTF-8 its entire existence. The use of code pages
> has been discouraged for decades. We're not going first :)

I just wrote a simple function to enumerate the 822 system locales on my Windows box (using EnumSystemLocalesEx and GetLocaleInfoEx, which are Unicode-only functions), and 36.7% of them lack an ANSI codepage. They're Unicode-only locales. UTF-8 is the only way to support these locales with a bytes API.
Re: [Python-Dev] Status of the Argument Clinic DSL
On Thu, Aug 4, 2016 at 11:33 PM, Alexander Belopolsky wrote:
>
> On Thu, Aug 4, 2016 at 7:12 PM, Larry Hastings wrote:
>>
>> C extension functions get the module passed in automatically, but this is
>> done internally and from the Python level you can't see it.
>
> Always something new to learn! This was not so in Python 2.x - self was
> passed as NULL to the C module functions. When did this change?

In 2.x this is the `self` parameter (actually named "passthrough" in the source) of Py_InitModule4 [1, 2]. You probably use the Py_InitModule or Py_InitModule3 macros, which pass NULL for this parameter:

    #define Py_InitModule(name, methods) \
        Py_InitModule4(name, methods, (char *)NULL, (PyObject *)NULL, \
                       PYTHON_API_VERSION)

    #define Py_InitModule3(name, methods, doc) \
        Py_InitModule4(name, methods, doc, (PyObject *)NULL, \
                       PYTHON_API_VERSION)

Python 3's PyModule_Create2 [3-5] API makes this a reference to the module. It's currently implemented in PyModule_AddFunctions [6, 7].

[1]: https://docs.python.org/2/c-api/allocation.html#c.Py_InitModule4
[2]: https://hg.python.org/cpython/file/v2.7.12/Python/modsupport.c#l31
[3]: https://docs.python.org/3/c-api/module.html#c.PyModule_Create2
[4]: https://hg.python.org/cpython/file/v3.5.2/Objects/moduleobject.c#l133
[5]: https://hg.python.org/cpython/file/v3.0b1/Objects/moduleobject.c#l63
[6]: https://docs.python.org/3/c-api/module.html#c.PyModule_AddFunctions
[7]: https://hg.python.org/cpython/file/v3.5.2/Objects/moduleobject.c#l387
Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?
On Wed, Feb 10, 2016 at 2:30 PM, Andrew Barnert via Python-Dev wrote:
> [^3]: Say you write a program that assumes it will only be run on Shift-JIS
> systems, and you use
> CreateFileA to create a file named "ハローワールド". The actual bytes you're sending
> are cp436
> for "ânâìü[âÅü[âïâh", so the file on the CD is named, in Unicode,
> "ânâìü[âÅü[âïâh".

Unless the system default was changed or the program called SetFileApisToOEM, CreateFileA would decode using the ANSI codepage 1252, not the OEM codepage 437 (not 436), i.e. "ƒnƒ\x8d\x81[ƒ\x8f\x81[ƒ‹ƒh". Otherwise the example is right. But the transcoding strategy won't work in general. For example, if the tables are turned such that the ANSI codepage is 932 and the program passes a bytes name from codepage 1252, the user on the other end won't be able to transcode without error if the original bytes contained invalid DBCS sequences that were mapped to the default character, U+30FB. This transcodes as the meaningless string "\x81E". The user can replace that string with "--" and enjoy a nice game of hangman.
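The two decodings above can be reproduced with Python's codecs. One caveat is an assumption baked into this sketch: Python's cp1252 codec rejects the bytes that codepage 1252 leaves undefined (0x81, 0x8D, 0x8F, ...), whereas the message shows Windows passing them through as C1 controls, so the hypothetical helper decode_windows_1252 imitates that pass-through by hand.

```python
def decode_windows_1252(data):
    """Decode as cp1252, passing undefined bytes through as C1 controls."""
    out = []
    for byte in data:
        try:
            out.append(bytes([byte]).decode('cp1252'))
        except UnicodeDecodeError:
            out.append(chr(byte))  # assumed Windows behavior, per the text above
    return ''.join(out)

shift_jis_bytes = 'ハローワールド'.encode('cp932')

as_oem = shift_jis_bytes.decode('cp437')         # OEM codepage reading
as_ansi = decode_windows_1252(shift_jis_bytes)   # ANSI codepage reading

print(as_oem)
print(ascii(as_ansi))
print('\u30fb'.encode('cp932'))                  # the default character, as bytes
```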
Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?
On Tue, Feb 9, 2016 at 3:22 AM, Victor Stinner <victor.stin...@gmail.com> wrote: > 2016-02-09 1:37 GMT+01:00 eryk sun <eryk...@gmail.com>: >> For example, in codepage 932 (Japanese), it's an error if a lead byte >> (i.e. 0x81-0x9F, 0xE0-0xFC) is followed by a trailing byte with a >> value less than 0x40 (note that ASCII 0-9 is 0x30-0x39, so this is not >> uncommon). In this case the ANSI API substitutes the default character >> for Japanese, '・' (U+30FB, Katakana middle dot). >> >> >>> locale.getpreferredencoding() >> 'cp932' >> >>> open(b'\xe05', 'w').close() >> >>> os.listdir('.') >> ['・'] >> >>> os.listdir(b'.') >> [b'\x81E'] >> >> All invalid sequences get mapped to '・', which roundtrips as >> b'\x81\x45', so you can't reliably create and open files with >> arbitrary bytes paths in this locale. > > Oh, and I forgot to ask: what is your filesystem? Is it the same > behaviour for NTFS, FAT32, network shared directories, etc.? That was tested using NTFS, but the same would apply to FAT32, exFAT, and UDF since they all use Unicode [1]. CreateFile[A|W] wraps the NtCreateFile system call. The NT executive is Unicode, so the system call receives the filename using a Unicode-only OBJECT_ATTRIBUTES [2] record. I can't say what an arbitrary non-Microsoft filesystem will do with the U+30FB character when it processes the IRP_MJ_CREATE. I was only concerned with ANSI<=>Unicode conversion that's implemented in the ntdll.dll runtime library. [1]: https://msdn.microsoft.com/en-us/library/ee681827 [2]: https://msdn.microsoft.com/en-us/library/ff557749 ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?
On Tue, Feb 9, 2016 at 3:21 AM, Victor Stinner <victor.stin...@gmail.com> wrote:
> 2016-02-09 1:37 GMT+01:00 eryk sun <eryk...@gmail.com>:
>> For example, in codepage 932 (Japanese), it's an error if a lead byte
>> (i.e. 0x81-0x9F, 0xE0-0xFC) is followed by a trailing byte with a
>> value less than 0x40 (note that ASCII 0-9 is 0x30-0x39, so this is not
>> uncommon). In this case the ANSI API substitutes the default character
>> for Japanese, '・' (U+30FB, Katakana middle dot).
>>
>>     >>> locale.getpreferredencoding()
>>     'cp932'
>>     >>> open(b'\xe05', 'w').close()
>>     >>> os.listdir('.')
>>     ['・']
>>     >>> os.listdir(b'.')
>>     [b'\x81E']
>
> Hum, I'm not sure that I understand your example.

Say I create a sequence of files with the names "file_à[N].txt"
encoded in Latin-1, where N is 0-2. They all map to the same file in a
Japanese system locale:

    >>> open(b'file_\xe00.txt', 'w').close(); os.listdir('.')
    ['file_・.txt']
    >>> open(b'file_\xe01.txt', 'w').close(); os.listdir('.')
    ['file_・.txt']
    >>> open(b'file_\xe02.txt', 'w').close(); os.listdir('.')
    ['file_・.txt']
    >>> os.listdir(b'.')
    [b'file_\x81E.txt']

This isn't a problem with a single-byte codepage such as 1251. For
example, codepage 1251 doesn't define a character for b"\x98", but the
conversion harmlessly maps it to "\x98" (SOS in the C1 Controls block).

Single-byte code pages still have the problem that when a filename is
created using the wide-character API, listing it as bytes may use
either an approximate mapping (e.g. "à" => "a" in 1251) or the
codepage default character (e.g. "\xd7" => "?" in 1251).
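The collapse of several names into one can be imitated without a Japanese system locale. The sketch below is a simulation under stated assumptions, not the Windows implementation: simulate_ansi_decode is a hypothetical function that walks the bytes using the lead-byte ranges quoted in the thread, lets Python's cp932 codec decide whether a sequence is valid, and substitutes U+30FB when it is not.

```python
# Sketch: emulate the default-character substitution for code page 932.
# Assumption: this only models "an invalid double-byte sequence becomes
# U+30FB"; the real conversion lives in Windows, not in this code.

DEFAULT_CHAR = '\u30fb'  # Katakana middle dot

def is_lead_byte(value: int) -> bool:
    """Lead-byte ranges for code page 932 as given in the thread."""
    return 0x81 <= value <= 0x9F or 0xE0 <= value <= 0xFC

def simulate_ansi_decode(data: bytes) -> str:
    """Hypothetical stand-in for the ANSI-to-Unicode conversion."""
    chars = []
    i = 0
    while i < len(data):
        width = 2 if is_lead_byte(data[i]) else 1
        chunk = data[i:i + width]
        try:
            chars.append(chunk.decode('cp932'))
        except UnicodeDecodeError:
            # Invalid or truncated sequence: use the default character.
            chars.append(DEFAULT_CHAR)
        i += width
    return ''.join(chars)

names = [b'file_\xe00.txt', b'file_\xe01.txt', b'file_\xe02.txt']
decoded = [simulate_ansi_decode(n) for n in names]
listed_as_bytes = decoded[0].encode('cp932')

print(decoded)
print(listed_as_bytes)
```

Because the lead byte consumes the digit that follows it, the three distinct byte names differ only inside the sequence that gets replaced, which is why they end up naming a single file.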
Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?
On Mon, Feb 8, 2016 at 2:41 PM, Chris Barker wrote:
> Just to clarify -- what does it currently do for bytes? IIUC, Windows uses
> UTF-16, so can you pass in UTF-16 bytes? Or when using bytes is it assuming
> some Windows ANSI-compatible encoding? (and what does it return?)

UTF-16 is used in the [W]ide-character API. Bytes paths use the [A]NSI
codepage. For a single-byte codepage, the ANSI API roundtrips, i.e. a
bytes path that's passed to CreateFileA matches the listing from
FindFirstFileA. But for a DBCS codepage arbitrary bytes paths do not
roundtrip. Invalid byte sequences map to the default character. Note
that an ASCII question mark is not always the default character. It
depends on the codepage.

For example, in codepage 932 (Japanese), it's an error if a lead byte
(i.e. 0x81-0x9F, 0xE0-0xFC) is followed by a trailing byte with a
value less than 0x40 (note that ASCII 0-9 is 0x30-0x39, so this is not
uncommon). In this case the ANSI API substitutes the default character
for Japanese, '・' (U+30FB, Katakana middle dot).

    >>> locale.getpreferredencoding()
    'cp932'
    >>> open(b'\xe05', 'w').close()
    >>> os.listdir('.')
    ['・']
    >>> os.listdir(b'.')
    [b'\x81E']

All invalid sequences get mapped to '・', which roundtrips as
b'\x81\x45', so you can't reliably create and open files with
arbitrary bytes paths in this locale.
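A rough way to see the difference between single-byte and DBCS code pages without a Japanese Windows system is to lean on Python's codecs. This is only a sketch under a few assumptions: Python's codecs are not the Windows conversion tables, code page 437 is picked purely because Python defines all 256 of its byte values, and Python raises an error at the point where the ANSI API would substitute the default character.

```python
# Sketch: Python's codecs stand in for the Windows ANSI conversion.

ALL_BYTES = bytes(range(256))

# Single-byte case: every byte value decodes and encodes back unchanged.
single_byte_roundtrips = (
    ALL_BYTES.decode('cp437').encode('cp437') == ALL_BYTES
)

# DBCS case: 0xE0 is a lead byte, but '5' (0x35) is below 0x40 and so is
# not a valid trailing byte in code page 932.
name = b'\xe05'
try:
    name.decode('cp932')
    is_valid_932 = True
except UnicodeDecodeError:
    is_valid_932 = False

# The thread reports Windows substituting U+30FB here. Encoding that
# character yields different bytes than the ones the caller passed in.
listed_name = '\u30fb'.encode('cp932')

print(single_byte_roundtrips, is_valid_932, listed_name, listed_name == name)
```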
Re: [Python-Dev] When does `PyType_Type.tp_alloc get assigned to PyType_GenericAlloc ?
On Sun, Feb 7, 2016 at 7:58 AM, Randy Eels wrote:
>
> Yet, I can't seem to understand where and when does the `tp_alloc` slot of
> PyType_Type get re-assigned to PyType_GenericAlloc. Does that even happen?
> Or am I missing something bigger?

_Py_InitializeEx_Private in Python/pylifecycle.c calls _Py_ReadyTypes
in Objects/object.c. This calls PyType_Ready(&PyType_Type) in
Objects/typeobject.c, which assigns type->tp_base =
&PyBaseObject_Type and then calls inherit_slots. This executes
COPYSLOT(tp_alloc), which assigns PyType_Type.tp_alloc =
PyBaseObject_Type.tp_alloc, which is statically assigned as
PyType_GenericAlloc.

Debug trace on Windows:

    0:000> bp python35!PyType_Ready
    0:000> g
    Breakpoint 0 hit
    python35!PyType_Ready:
    `6502d160 4053 push rbx
    0:000> ?? ((PyTypeObject *)@rcx)->tp_name
    char * 0x`650e4044 "object"
    0:000> g
    Breakpoint 0 hit
    python35!PyType_Ready:
    `6502d160 4053 push rbx
    0:000> ?? ((PyTypeObject *)@rcx)->tp_name
    char * 0x`651d8e5c "type"
    0:000> bp python35!inherit_slots
    0:000> g
    Breakpoint 1 hit
    python35!inherit_slots:
    `6502c440 48895c2408 mov qword ptr [rsp+8],rbx ss:`0028f960={ python35!PyType_Type (`6527cba0)}

At entry to inherit_slots, PyType_Type.tp_alloc is NULL:

    0:000> ?? python35!PyType_Type.tp_alloc
    * 0x`
    0:000> pt
    python35!inherit_slots+0xd17:
    `6502d157 c3 ret

At exit it's set to PyType_GenericAlloc:

    0:000> ?? python35!PyType_Type.tp_alloc
    * 0x`65025580
    0:000> ln 65025580
    (`65025580) python35!PyType_GenericAlloc | (`650256a0) python35!PyType_GenericNew
    Exact matches:
        python35!PyType_GenericAlloc (void)
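The inheritance step can be pictured with a toy model in Python. This is not CPython's code: ToyType, copy_slot, toy_type_ready and generic_alloc are hypothetical names, plain objects stand in for the C type structs, and only the rule described above is modelled, namely that a slot left unset is copied from the base once the base has been filled in.

```python
# Toy model only: plain Python objects stand in for the C type structs.

class ToyType:
    def __init__(self, name, tp_base=None, tp_alloc=None):
        self.name = name
        self.tp_base = tp_base
        self.tp_alloc = tp_alloc

def generic_alloc(toy_type):
    """Stand-in for PyType_GenericAlloc."""
    return {'type': toy_type.name}

def copy_slot(toy_type, base, slot):
    """Copy a slot from the base only if the type leaves it unset."""
    if getattr(toy_type, slot) is None and getattr(base, slot) is not None:
        setattr(toy_type, slot, getattr(base, slot))

def toy_type_ready(toy_type, default_base):
    """Default the base, then inherit unset slots from it."""
    if toy_type.tp_base is None and toy_type is not default_base:
        toy_type.tp_base = default_base
    if toy_type.tp_base is not None:
        copy_slot(toy_type, toy_type.tp_base, 'tp_alloc')

toy_object = ToyType('object', tp_alloc=generic_alloc)  # statically assigned
toy_type = ToyType('type')                              # tp_alloc starts unset

# Same order as in the debug trace: "object" is readied before "type".
toy_type_ready(toy_object, toy_object)
toy_type_ready(toy_type, toy_object)

print(toy_type.tp_alloc is generic_alloc)
```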
Re: [Python-Dev] Python environment registration in the Windows Registry
On Wed, Feb 3, 2016 at 7:33 PM, Eric Snow wrote:
> Just wanted to quickly point out another use of the Windows registry
> in Python: WindowsRegistryFinder [1]. This is an import "meta-path"
> finder that locates modules declared (*not* defined) in the registry.
> I'm not familiar with the Windows registry nor do I know if anyone is
> using this finder.

The "Modules" key (WindowsRegistryFinder in 3.3+ and previously
PyWin_FindRegisteredModule) adds individual modules by subkey name,
with the filepath in the default value (the filename can differ from
the module name, but it can't use an arbitrary extension). The
"PythonPath" and "Modules" keys both date back to Mark Hammond's
Windows port in the mid 1990s.
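As a rough illustration of "subkey name plus default value", the sketch below looks up a module's registered file path with the standard winreg module. Several things here are assumptions rather than facts from the message: the key is taken to live under Software\Python\PythonCore\<version>\Modules, sys.winver is used for the version component, HKCU is tried before HKLM, and the function names are hypothetical. It is not the finder's actual implementation, and on non-Windows platforms the lookup simply returns None.

```python
# Sketch: read a module's file path from the "Modules" registry key.
# Assumptions: key layout, version component and search order (see above).
import sys

MODULES_KEY = r'Software\Python\PythonCore\{version}\Modules\{fullname}'

def registry_module_key(fullname, version):
    """Build the subkey path for a module name (hypothetical helper)."""
    return MODULES_KEY.format(version=version, fullname=fullname)

def find_registered_module(fullname):
    """Return the file path stored in the key's default value, or None."""
    if sys.platform != 'win32':
        return None
    import winreg  # only available on Windows
    subkey = registry_module_key(fullname, sys.winver)
    for root in (winreg.HKEY_CURRENT_USER, winreg.HKEY_LOCAL_MACHINE):
        try:
            # QueryValue reads the unnamed (default) value of the subkey.
            return winreg.QueryValue(root, subkey)
        except OSError:
            continue
    return None

print(registry_module_key('spam', '3.5'))
print(find_registered_module('spam'))
```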
Re: [Python-Dev] Python environment registration in the Windows Registry
On Wed, Feb 3, 2016 at 10:46 AM, Steve Dower wrote:
>
> sys.path.extend(read_subkeys(fr'HKCU\Software\Python\PythonCore\{sys.winver}\PythonPath\**'))
> sys.path.extend(read_subkeys(fr'HKLM\Software\Python\PythonCore\{sys.winver}\PythonPath\**'))

It seems like a bug (in spirit at least) that this step isn't skipped
for -E and -I (Py_IgnoreEnvironmentFlag, Py_IsolatedFlag).

> I haven't looked into pywin32's use of this recently - I tend to only use
> Christoph Gohlke's wheels that don't register anything.

I install the pypiwin32 wheel using pip. It sets up its paths with
pypiwin32.pth:

    # .pth file for the PyWin32 extensions
    win32
    win32\lib
    Pythonwin
    import os;os.environ["PATH"]+=(';'+os.path.join(sitedir,"pypiwin32_system32"))

This is different from a PythonPath subkey in a couple of respects.
The paths listed in .pth files are appended to sys.path instead of
prepended. They also don't get added when run with -S or for a venv
environment that excludes site-packages.
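The append behaviour of .pth files is easy to check with the standard site module. The sketch below is illustrative only: pth_entries_added is a hypothetical helper, and the directory and file names are made up for the demonstration. It writes a throwaway .pth file into a temporary directory, processes it with site.addsitedir, and reports which entries were added to the end of sys.path before restoring the original list.

```python
# Sketch: observe how site.addsitedir handles the lines of a .pth file.
import os
import site
import sys
import tempfile

def pth_entries_added(lines):
    """Return the sys.path entries a throwaway .pth file adds.

    Hypothetical helper. Entries are reported relative to the temporary
    site directory, so '.' is the site directory itself.
    """
    with tempfile.TemporaryDirectory() as sitedir:
        os.mkdir(os.path.join(sitedir, 'win32'))
        pth_path = os.path.join(sitedir, 'demo.pth')
        with open(pth_path, 'w', encoding='utf-8') as f:
            f.write('\n'.join(lines) + '\n')
        before = list(sys.path)
        site.addsitedir(sitedir)           # appends, never prepends
        added = sys.path[len(before):]
        sys.path[:] = before               # undo the change
        return [os.path.relpath(p, sitedir) for p in added]

added = pth_entries_added(['# comment line', 'win32', 'no_such_dir'])
print(added)
```

Comment lines are ignored and directories that don't exist are skipped, so only the site directory and its existing win32 subdirectory are added. Lines that start with "import", like the last line of pypiwin32.pth, are executed rather than treated as paths; that case is left out of the sketch.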