[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Steve Dower

On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:
TLDR: In os.scandir directory entries, atime is always a copy of mtime 
rather than the actual access time.


Correction - os.stat() updates the access time to _now_, while 
os.scandir() returns the last access time without updating it.


Eryk replied with a deeper explanation of the cause, but fundamentally 
this is what you are seeing.


Feel free to file a bug, but we'll likely only add a vague note to the 
docs about how Windows works here rather than changing anything. If 
anything, we should probably fix os.stat() to avoid updating the access 
time so that both functions behave the same, but that might be too 
complicated.


Cheers,
Steve
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NGMVB7GWDBCPYHL4IND2LBZ2QPXLWRAX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Steve Dower

On 19Oct2020 1242, Steve Dower wrote:

On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:
TLDR: In os.scandir directory entries, atime is always a copy of mtime 
rather than the actual access time.


Correction - os.stat() updates the access time to _now_, while 
os.scandir() returns the last access time without updating it.


Let me correct myself first :)

*Windows* has decided not to update file access time metadata *in 
directory entries* on reads. os.stat() always[1] looks at the file entry 
metadata, while os.scandir() always looks at the directory entry metadata.


My suggested approach still applies, other than the bit where we might 
fix os.stat(). The best we can do is regress os.scandir() to have 
similarly poor performance, but the best *you* can do is use os.stat() 
for accurate timings when files might be being modified while your 
program is running, and don't do it when you just need names/kinds (and 
I'm okay adding that note to the docs).
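
A minimal sketch of the split suggested above, assuming only the documented os.scandir() and os.stat() calls. The helper names names_and_kinds and fresh_access_times are made up for illustration. The first helper stays on the cheap directory-entry data; the second pays for one os.stat() call per entry to get current timestamps.

```python
import os


def names_and_kinds(directory):
    """Cheap listing: relies only on what os.scandir() already returned."""
    with os.scandir(directory) as entries:
        # is_dir() usually needs no extra system call per entry.
        return sorted(
            (entry.name, "dir" if entry.is_dir() else "file")
            for entry in entries
        )


def fresh_access_times(directory):
    """Slower listing: one os.stat() call per entry for current times."""
    with os.scandir(directory) as entries:
        # os.stat() reads the file's own metadata, not the directory entry.
        return {entry.name: os.stat(entry.path).st_atime for entry in entries}
```

Something like names_and_kinds() is enough when only names and kinds matter; fresh_access_times() is the variant to reach for when the timestamps have to be current.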


Cheers,
Steve

[1]: With some fallback to directory entries in exceptional cases that 
don't apply here.

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QHHJFYEDBANW7EC3JOUFE7BQRT5ILL4O/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Ivan Pozdeev via Python-Dev



On 19.10.2020 14:47, Steve Dower wrote:

On 19Oct2020 1242, Steve Dower wrote:

On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:

TLDR: In os.scandir directory entries, atime is always a copy of mtime rather 
than the actual access time.


Correction - os.stat() updates the access time to _now_, while os.scandir() 
returns the last access time without updating it.


Let me correct myself first :)

*Windows* has decided not to update file access time metadata *in directory entries* on reads. os.stat() always[1] looks at the file entry 
metadata, while os.scandir() always looks at the directory entry metadata.


Is this behavior documented somewhere?

Such weirdness is certainly something that needs to be documented, but I really don't like describing such quirks that are out of our control 
and may be subject to change in Python documentation. So we should only consider doing so if there are no other options.





My suggested approach still applies, other than the bit where we might fix os.stat(). The best we can do is regress os.scandir() to have 
similarly poor performance, but the best *you* can do is use os.stat() for accurate timings when files might be being modified while your 
program is running, and don't do it when you just need names/kinds (and I'm okay adding that note to the docs).


Cheers,
Steve

[1]: With some fallback to directory entries in exceptional cases that don't 
apply here.
--
Regards,
Ivan

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VFXDBURSZ4QKA6EQBZLU6K4FKMGZPSF5/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Random832
On Mon, Oct 19, 2020, at 07:42, Steve Dower wrote:
> On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:
> > TLDR: In os.scandir directory entries, atime is always a copy of mtime 
> > rather than the actual access time.
> 
> Correction - os.stat() updates the access time to _now_, while 
> os.scandir() returns the last access time without updating it.

This is surprising - do we know why this happens?

Also, it doesn't seem true on my system with python 3.8.5 [and, yes, I checked 
that last access update is enabled for my test and updates normally when 
reading the file's contents].
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/GX3KD4UQKJONCLOZY743WXNGENXL7YG2/


[Python-Dev] Re: PEP 638: Syntactic macros

2020-10-19 Thread Random832
On Fri, Oct 16, 2020, at 18:59, Dan Stromberg wrote:
> The complexity of a language varies with the square of its feature 
> count,

Says who? I'd assume the orthogonality and regularity of features matters at 
least as much if not more than the number of features, and providing a system 
like this would guarantee some degree of regularity.

Is there some notion of "complexity of a language" [other than by trivially 
*defining* it as the square of the number of features] for which this can be 
shown to be true?
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/JVJDRCNNSDMDIIFWJ65MBQVTWC375SJW/


[Python-Dev] Re: PEP 638: Syntactic macros

2020-10-19 Thread Martin (gzlist) via Python-Dev
On Fri, 16 Oct 2020 at 23:22, Guido van Rossum wrote:
>
> Dima,
>
> Do you have a link to "babel macros"? Searching for that brought up several 
> different things; not being a frequent JS user I don't know how to filter 
> these.

These links should help:

https://babeljs.io/blog/2017/09/11/zero-config-with-babel-macros
https://github.com/kentcdodds/babel-plugin-macros
https://github.com/jgierer12/awesome-babel-macros

That's a general intro, the code repo for the macro plugin, and a repo
that lists implemented macros.

Martin
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/HJR6UF2XPOC6UUY6QRLKYCYZOWK2BYDD/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Gregory P. Smith
On Mon, Oct 19, 2020 at 6:28 AM Ivan Pozdeev via Python-Dev <
python-dev@python.org> wrote:

>
> On 19.10.2020 14:47, Steve Dower wrote:
> > On 19Oct2020 1242, Steve Dower wrote:
> >> On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:
> >>> TLDR: In os.scandir directory entries, atime is always a copy of mtime
> rather than the actual access time.
> >>
> >> Correction - os.stat() updates the access time to _now_, while
> os.scandir() returns the last access time without updating it.
> >
> > Let me correct myself first :)
> >
> > *Windows* has decided not to update file access time metadata *in
> directory entries* on reads. os.stat() always[1] looks at the file entry
> > metadata, while os.scandir() always looks at the directory entry
> metadata.
>
> Is this behavior documented somewhere?
>
> Such weirdness is certainly something that needs to be documented, but I
> really don't like describing such quirks that are out of our control
> and may be subject to change in Python documentation. So we should only
> consider doing so if there are no other options.
>

I'm sure this is covered in MSDN.  Linking to that if it has it in a
concise explanation would make sense from a note in our docs.

If I'm understanding Steve correctly this is due to Windows/NTFS storing
the access time potentially redundantly in two different places. One within
the directory entry itself and one with the file's own metadata.  Those of
us with a traditional posix filesystem background may raise eyebrows at
this duplication, seeing a directory as a place that merely maps names to
inodes with the inode structure (equiv: file entry metadata) being the sole
source of truth.  Which ones get updated when and by what actions is up to
the OS.

So yes, just document the "quirk" as an intended OS behavior.  This is one
reason scandir() can return additional information on windows vs what it
can return on posix.  The entire point of scandir() is to return as much as
possible from the directory without triggering reads of the
inodes/file-entry-metadata. :)

-gps


>
> >
> > My suggested approach still applies, other than the bit where we might
> fix os.stat(). The best we can do is regress os.scandir() to have
> > similarly poor performance, but the best *you* can do is use os.stat()
> for accurate timings when files might be being modified while your
> > program is running, and don't do it when you just need names/kinds (and
> I'm okay adding that note to the docs).
> >
> > Cheers,
> > Steve
> >
> > [1]: With some fallback to directory entries in exceptional cases that
> don't apply here.
> > --
> > Regards,
> > Ivan
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/IZ6KSRTJLORCB33OMVUPFYQYLMBM26EJ/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Mats Wichmann
On 10/19/20 9:52 AM, Gregory P. Smith wrote:
> 
> 
> On Mon, Oct 19, 2020 at 6:28 AM Ivan Pozdeev via Python-Dev
> <python-dev@python.org> wrote:
> 
> 
> On 19.10.2020 14:47, Steve Dower wrote:
> > On 19Oct2020 1242, Steve Dower wrote:
> >> On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:
> >>> TLDR: In os.scandir directory entries, atime is always a copy of
> mtime rather than the actual access time.
> >>
> >> Correction - os.stat() updates the access time to _now_, while
> os.scandir() returns the last access time without updating it.
> >
> > Let me correct myself first :)
> >
> > *Windows* has decided not to update file access time metadata *in
> directory entries* on reads. os.stat() always[1] looks at the file
> entry
> > metadata, while os.scandir() always looks at the directory entry
> metadata.
> 
> Is this behavior documented somewhere?
> 
> Such weirdness is certainly something that needs to be documented, but
> I really don't like describing such quirks that are out of our control
> and may be subject to change in Python documentation. So we should
> only consider doing so if there are no other options.
> 
> 
> I'm sure this is covered in MSDN.  Linking to that if it has it in a
> concise explanation would make sense from a note in our docs.
> 
> If I'm understanding Steve correctly this is due to Windows/NTFS storing
> the access time potentially redundantly in two different places. One
> within the directory entry itself and one with the file's own metadata. 
> Those of us with a traditional posix filesystem background may raise
> eyebrows at this duplication, seeing a directory as a place that merely
> maps names to inodes with the inode structure (equiv: file entry
> metadata) being the sole source of truth.  Which ones get updated when
> and by what actions is up to the OS.
> 
> So yes, just document the "quirk" as an intended OS behavior.  This is
> one reason scandir() can return additional information on windows vs
> what it can return on posix.  The entire point of scandir() is to return
> as much as possible from the directory without triggering reads of the
> inodes/file-entry-metadata. :)
> 
> -gps

depending on atimes isn't a consistently reliable mechanism anyway,
since filesystems on Linux et al. are allowed to be mounted so as to
not independently update access times.
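
A rough way to check for this from Python, assuming a platform where os.statvfs() exists and the os module exposes the glibc-style ST_NOATIME flag. The helper name mounted_noatime is made up for illustration, and the check does not cover related options such as relatime.

```python
import os


def mounted_noatime(path="."):
    """Return True/False when the platform reports a noatime flag, else None."""
    flag = getattr(os, "ST_NOATIME", None)
    if flag is None or not hasattr(os, "statvfs"):
        # Platform does not expose the flag (e.g. Windows); cannot tell.
        return None
    # f_flag holds the mount flags of the filesystem containing `path`.
    return bool(os.statvfs(path).f_flag & flag)
```

A result of None only means the question cannot be answered this way on the current platform, not that access times are reliable.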

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QXNHYK6NDECISIOZVO4BCW2O6UXRZJGO/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Steve Dower

On 19Oct2020 1652, Gregory P. Smith wrote:
I'm sure this is covered in MSDN.  Linking to that if it has it in a 
concise explanation would make sense from a note in our docs.


Probably unlikely :) I'm pretty sure this started "perfect" and was then 
wound back to improve performance. But it's almost certainly an option 
somewhere, which means you can't rely on it being either true or false. 
You just have to be explicit for certain pieces of information.


If I'm understanding Steve correctly this is due to Windows/NTFS storing 
the access time potentially redundantly in two different places. One 
within the directory entry itself and one with the file's own metadata.  
Those of us with a traditional posix filesystem background may raise 
eyebrows at this duplication, seeing a directory as a place that merely 
maps names to inodes with the inode structure (equiv: file entry 
metadata) being the sole source of truth.  Which ones get updated when 
and by what actions is up to the OS.


So yes, just document the "quirk" as an intended OS behavior.  This is 
one reason scandir() can return additional information on windows vs 
what it can return on posix.  The entire point of scandir() is to return 
as much as possible from the directory without triggering reads of the 
inodes/file-entry-metadata. :)


Yeah, I'd document it as a quirk of scandir. There's also a race where 
if you scandir(), then someone touches the file, then you look at the 
cached stat you get the wrong information too (on any platform). Making 
clearer that it's for non-time sensitive queries is most accurate, 
though we could also give an example of "access times may not be up to 
date depending on OS-level caching" without committing us to being 
responsible for OS decisions.
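
The race described above can be sketched as follows, assuming only documented os and tempfile calls. The helper name cached_vs_fresh_mtime and the two timestamps are arbitrary choices for the illustration; os.utime() stands in for "someone touches the file".

```python
import os
import tempfile


def cached_vs_fresh_mtime():
    """Show that DirEntry.stat() keeps its cached result after a change."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")
        with open(path, "w") as f:
            f.write("data")
        os.utime(path, (1_000_000_000, 1_000_000_000))

        with os.scandir(tmp) as entries:
            entry = next(entries)
            cached_before = entry.stat().st_mtime  # result is cached on the entry
            # Simulate another process touching the file after the scan.
            os.utime(path, (2_000_000_000, 2_000_000_000))
            cached_after = entry.stat().st_mtime   # served from the cache

        fresh = os.stat(path).st_mtime             # asks the OS again
    return cached_before, cached_after, fresh
```

The two cached values should match each other, while the os.stat() value reflects the later change, which is why the cached data suits only queries that are not time sensitive.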


Cheers,
Steve
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/EBWUDEQEPRWJN36FLUUJQWP5EWLPWRPD/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Eryk Sun
On 10/19/20, Steve Dower wrote:
> On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:
>> TLDR: In os.scandir directory entries, atime is always a copy of mtime
>> rather than the actual access time.
>
> Correction - os.stat() updates the access time to _now_, while
> os.scandir() returns the last access time without updating it.

os.stat() shouldn't affect st_atime because it doesn't access the file
data. That has me curious if it can be reproduced.

With NTFS in Windows 10, I'd expect the os.stat() st_atime to change
immediately when the file data is read or modified. With other
filesystems, it may not be updated until the kernel file object that
was used to access the file's data is closed.

Note that updating the access time in NTFS can be disabled by the
"NtfsDisableLastAccessUpdate" value in
"HKLM\System\CurrentControlSet\Control\FileSystem". The default value
in Windows 10 should be 0x8002, which means the value is system
managed and updating the access time is enabled.
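
For anyone who wants to look at that value from Python, here is a small sketch using the standard winreg module. The helper name read_last_access_setting is made up, it returns None on non-Windows platforms or when the value cannot be read, and it deliberately does not try to interpret the number.

```python
import sys

KEY_PATH = r"System\CurrentControlSet\Control\FileSystem"
VALUE_NAME = "NtfsDisableLastAccessUpdate"


def read_last_access_setting():
    """Return the raw registry value, or None when it is unavailable."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
    except OSError:
        # Key or value missing, or access denied.
        return None
    return value
```

Printing the result with hex() makes it easier to compare against the value quoted above.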

If it's only the access time that changes, the directory entry may be
updated with a significant granularity such as hourly or daily. For
NTFS, it's hourly. To confirm this, wait an hour from the current
access time in the directory entry; open the file; read some data; and
close the file. The access time in the directory entry should be
updated.

For details, download the [MS-FSA] PDF [1] and look for all references
to the following sections:

* 2.1.4.17 Algorithm for Noting That a File Has Been Modified
* 2.1.4.19 Algorithm for Noting That a File Has Been Accessed
* 2.1.4.18 Algorithm for Updating Duplicated Information

Also check the tables in Appendix A, which provide the update
granularity of file time stamps (presumably for directory entries) for
common Windows filesystems.

[1] 
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fsa/860b1516-c452-47b4-bdbc-625d344e2041

Going back to my initial message, I can't stress enough that this
problem is at its worst when a file has multiple hardlinks. If a
particular link in a directory wasn't the last link used to access the
file, its duplicated metadata may have the wrong file size, access
time, modify time, and change time (the latter is not reported by
Python). As is, for the current implementation, I'd only rely on the
basic attributes such as whether it's a directory or reparse point
(symlink, mountpoint, etc) when using scandir() to quickly process a
directory. For reliable stat information, call os.stat().
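
A sketch of how the hard-link case could be checked, assuming the temporary directory lives on a filesystem that supports os.link(). The helper name compare_link_sizes and the file names are made up. It writes through one link and then compares the size reported for the other link by os.scandir() and by os.stat().

```python
import os
import tempfile


def compare_link_sizes():
    """Write through one hard link, then read the size via the other link."""
    with tempfile.TemporaryDirectory() as tmp:
        dir_a = os.path.join(tmp, "a")
        dir_b = os.path.join(tmp, "b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)
        first = os.path.join(dir_a, "data.bin")
        second = os.path.join(dir_b, "link.bin")

        with open(first, "wb") as f:
            f.write(b"x")
        os.link(first, second)          # second name for the same file
        with open(first, "ab") as f:
            f.write(b"y" * 99)          # grow the file via the first link only

        with os.scandir(dir_b) as entries:
            scandir_size = next(entries).stat().st_size
        stat_size = os.stat(second).st_size
    return scandir_size, stat_size
```

On POSIX systems both numbers should agree; going by the description above, on NTFS the os.scandir() number may lag behind, while os.stat() reports the real size.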

I do think, however, that os.scandir() can be improved in Windows
without significant performance loss if it calls GetFileAttributesExW
to get st_file_attributes, st_size, st_ctime (create time), st_mtime,
and st_atime. This API call is relatively fast because it doesn't
require opening the file via CreateFileW, which is one of the more
expensive operations in os.stat(). But I haven't tried modifying
scandir() to benchmark it.

Ultimately, I'm waiting for Windows 10 to provide a WinAPI function
that calls the relatively new NTAPI function NtQueryInformationByName
[2] (by name, not by handle!) to get the FileStatInformation, as well
as for this information to be made available by handle via
GetFileInformationByHandleEx. Compared to GetFileAttributesExW, the
FileStatInformation additionally provides the file ID (if implemented
by the filesystem), change time, reparse tag, number of links, and the
effective access of the security context of the caller (i.e. process
or thread access token). The latter is something that we've never
implemented with os.stat(). It's not the same as POSIX
owner-group-other permissions. It would need a new attribute such as
st_effective_access. It could be used to provide a real implementation
of os.access() in Windows.

[2] 
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-ntqueryinformationbyname
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/NPP6GKAEI7SOVA45WTJ222YVEALTF6WO/


[Python-Dev] Re: PEP 11: Drop support for AIX releases without dlopen

2020-10-19 Thread Brett Cannon
On Fri, Oct 16, 2020 at 2:52 PM Kevin Adler wrote:

> Interesting. Given that, shouldn't PEP 11 be updated with that change?
> Seems to me that PEP 11 only documents platforms with *official support*,
> so is AIX officially supported? The comment in the issue would indicate it
> is not officially supported


AIX is not officially supported. We have tried to be helpful and add/remove
things over the years related to AIX (we used to have an external
contributor who actively tried to keep AIX supported), but we don't
guarantee things work since there is no core dev available to try and keep
AIX running.


> , but it _is_ listed here:
> https://pythondev.readthedocs.io/platforms.html#python-platforms


That is not an official Python website.

-Brett


>
>
> Batuhan Taskaya wrote:
> > As far as I am aware, we already dropped support for AIX 5.3<=.  See
> > https://bugs.python.org/issue40680 for details.
> ___
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/KD4FVVQAAT5GCF6R3UXGPAEURWN3QUN6/
>
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/3YVCJ5I5WNVLNIZMHZJERZM2SLJXAUTM/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Steve Dower

On 19Oct2020 1846, Eryk Sun wrote:

On 10/19/20, Steve Dower wrote:

On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:

TLDR: In os.scandir directory entries, atime is always a copy of mtime
rather than the actual access time.


Correction - os.stat() updates the access time to _now_, while
os.scandir() returns the last access time without updating it.


os.stat() shouldn't affect st_atime because it doesn't access the file
data. That has me curious if it can be reproduced.

With NTFS in Windows 10, I'd expect the os.stat() st_atime to change
immediately when the file data is read or modified. With other
filesystems, it may not be updated until the kernel file object that
was used to access the file's data is closed.


I thought I got my self-correction fired off quickly enough to save you 
from writing this :)



For details, download the [MS-FSA] PDF [1] and look for all references
to the following sections:



[1] 
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fsa/860b1516-c452-47b4-bdbc-625d344e2041


Thanks for the detailed reference.


Going back to my initial message, I can't stress enough that this
problem is at its worst when a file has multiple hardlinks. If a
particular link in a directory wasn't the last link used to access the
file, its duplicated metadata may have the wrong file size, access
time, modify time, and change time (the latter is not reported by
Python). As is, for the current implementation, I'd only rely on the
basic attributes such as whether it's a directory or reparse point
(symlink, mountpoint, etc) when using scandir() to quickly process a
directory. For reliable stat information, call os.stat().

I do think, however, that os.scandir() can be improved in Windows
without significant performance loss if it calls GetFileAttributesExW
to get st_file_attributes, st_size, st_ctime (create time), st_mtime,
and st_atime. This API call is relatively fast because it doesn't
require opening the file via CreateFileW, which is one of the more
expensive operations in os.stat(). But I haven't tried modifying
scandir() to benchmark it.


Resolving the path is the most expensive part, even if the file is not 
opened (I've been working with the NTFS team on this area, and we've 
been benchmarking/analysing all of it). There are a few improvements 
coming across the board, but I'd much rather just emphasise that 
os.scandir() is as fast as we can manage using cached information 
(including as cached by the OS). Otherwise we prevent people from using 
the fastest available option when they can, if they don't need the 
additional information/accuracy.


Cheers,
Steve
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/MMRMLWGEV2ZGIACXQTSEQC6TPWGL3UZ3/


[Python-Dev] Re: Ideas for improving the contribution experience

2020-10-19 Thread Brett Cannon
I think the other way to help is to really lean into automation so
reviewing is even more lightweight than it is now. This can be as simple
as reminding people via a status check when they need to regenerate a file
like 'configure', or simply telling people when their PR failed a status
check, or it can go as far as trying to automate PEP 7/8 stuff.

Or another way to put it is to try and use automation to help make PRs more
self-service so that less is required from a human being.

On Fri, Oct 16, 2020 at 3:31 PM Tal Einat wrote:

> (Context: Continuing to prepare for the core dev sprint next week. Since
> the sprint is near, *I'd greatly appreciate any quick comments, feedback
> and ideas!*)
>
> Following up my collection of past beginning contributor experiences, I've
> collected these experiences in a dedicated GitHub repo[1] and written a
> (subjective!) summary of main themes that I recognize in the stories, which
> I've also included in the repo[2].
>
> A "TL;DR" bullet list of those main themes:
> * Slow/no responsiveness
> * Long, slow process
> * Hard to find where to contribute
> * Mentorship helps a lot, but is scarce
> * A lot to learn to get started
> * It's intimidating
>
> More specifically, something that has come up often is that maintaining
> momentum for new contributors is crucial for them to become long-term
> contributors. Most often, this comes up in relation to the first two
> points: Suggestions or PRs either receive no attention at all
> ("ignored") or stop receiving attention at some point ("lost to the void").
> Unfortunately, the probability of this is pretty high for any issue/PR, so
> for a new contributor this is almost guaranteed to happen while working on
> one of their first few contributions. I've seen this happen many times, and
> have found that I have to personally follow promising contributors' work to
> ensure that this doesn't happen to them. I've also seen contributors learn
> to actively seek out core devs when these situations arise, which is often
> a successful tactic, but shouldn't be necessary so often.
>
> Now, this is in large part a result of the fact that us core devs are not
> a very large group, made up almost entirely of volunteers working on this
> in their spare time. Last I checked, the total amount of paid development
> time dedicated to developing Python is less than 3 full-time (i.e. ~100
> hours a week).
>
> The situation being problematic is clear enough that the PSF had concrete
> plans to hire paid developers to review issues and PRs. However, those
> plans have been put on hold indefinitely, since the PSF's funding has
> shrunk dramatically since the COVID-19 outbreak (no PyCon!).
>
> So, what can be done? Besides raising more funds (see a note on this
> below), I think we can find ways to reduce how often issues/PRs become
> "stalled". Here are some ideas:
>
> 1. *Generate reminders for reviewers when an issue or PR becomes
> "stalled" due to them.* Personally, I've found that both b.p.o. and
> GitHub make it relatively hard to remember to follow up on all of the many
> issues/PRs you've taken part in reviewing. It takes considerable attention
> and discipline to do so consistently, and reminders like these would have
> helped me. Many (many!) times, all it took to get an issue/PR moving
> forward (or closed) was a simple "ping?" comment.
>
> 2. *Generate reminders for contributors when an issue or PR becomes
> "stalled" due to them.* Similar to the above, but I consider these
> separate.
>
> 3. *Advertise something like a "2-for-1" standing offer for reviews.*
> This would give contributors an "official", acceptable way to get attention
> for their issue/PR, other than "begging" for attention on a mailing list.
> There are good ways for new contributors to be of significant help despite
> being new to the project, such as checking whether old bugs are still
> relevant, searching for duplicate issues, or applying old patches to the
> current code and creating a PR. (This would be similar to Martin v. Löwis's
> 5-for-1 offer in 2012[3], which had little success but led to some
> interesting followup discussion[4]).
>
> 4. *Encourage core devs to dedicate some of their time to working through
> issues/PRs which are "ignored" or "stalled".* This would require first
> generating reliable lists of issues/PRs in such states. This could be in
> various forms, such as predefined GitHub/b.p.o. queries, a dedicated
> web-page, a periodic message similar to b.p.o.'s "weekly summary" email, or
> dedicated tags/labels for issues/PRs. (Perhaps prioritize "stalled" over
> "ignored".)
>
> - Tal Einat
>
>
> [1]: https://github.com/taleinat/python-contribution-feedback
> [2]:
> https://github.com/taleinat/python-contribution-feedback/blob/master/Takeaways%20-%20October%202020.md
> [3]:
> https://mail.python.org/archives/list/python-dev@python.org/message/7DLUN4Y7P77BSDW5YRWQQGVB3KVOY2M3/
> [4]:
> https://mail.python.org/archives/list/python-dev@py

[Python-Dev] Pickle for C extension?

2020-10-19 Thread Marco Sulla
TL;DR Is it possible to use C code to implement the (un)pickling of a type
written in a C extension, as it was written in _pickle.c?

Long explanation: I'm trying to create a C extension for frozendict. For
simplicity, first I wrote it in CPython, then I started to move it into a C
extension. It seems to work, but I have to move the code I wrote in
_pickle.c and pickle.py into the C extension. Is that possible, or do I have
to create a slower `__reduce_ex__` method that simply converts it to dict?

This is, for example, the C code for pickling frozendict in _pickle.c:
https://github.com/Marco-Sulla/cpython/blob/41a640a947c36007e56bbc28f362c261110d2001/Modules/_pickle.c#L3370
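
For reference, the slower fallback mentioned above could look roughly like this in pure Python. FrozenDictSketch is a hypothetical stand-in, not the real frozendict type, and the sketch uses `__reduce__` rather than `__reduce_ex__` for brevity. It rebuilds the object from a plain dict when unpickling.

```python
import pickle


class FrozenDictSketch(dict):
    """Hypothetical immutable mapping used only to illustrate pickling."""

    def __setitem__(self, key, value):
        raise TypeError("FrozenDictSketch does not support item assignment")

    def __reduce__(self):
        # Pickle as "call this class with a plain dict copy of the contents".
        return (type(self), (dict(self),))


def roundtrip(obj):
    """Pickle and unpickle an object with the default protocol."""
    return pickle.loads(pickle.dumps(obj))
```

With the class defined in an importable module, roundtrip(FrozenDictSketch({"a": 1})) should give back an equal FrozenDictSketch. A C extension type can expose a method with the same name and return value through its method table.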
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KJAC7STSXVD7SBK66HBMXIFPPX7SS5TO/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Eryk Sun
On 10/19/20, Steve Dower wrote:
> On 19Oct2020 1242, Steve Dower wrote:
>> On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:
>>> TLDR: In os.scandir directory entries, atime is always a copy of mtime
>>> rather than the actual access time.
>>
>> Correction - os.stat() updates the access time to _now_, while
>> os.scandir() returns the last access time without updating it.
>
> Let me correct myself first :)
>
> *Windows* has decided not to update file access time metadata *in
> directory entries* on reads. os.stat() always[1] looks at the file entry
> metadata, while os.scandir() always looks at the directory entry metadata.
>
> My suggested approach still applies, other than the bit where we might
> fix os.stat(). The best we can do is regress os.scandir() to have
> similarly poor performance, but the best *you* can do is use os.stat()
> for accurate timings when files might be being modified while your
> program is running, and don't do it when you just need names/kinds (and
> I'm okay adding that note to the docs).

If this is the correction to which you're referring in the previous
message, I assumed you stood by the claim that os.stat() may update
st_atime. That shouldn't be the case, so there shouldn't be anything
that needs to be fixed there, unless I'm missing what you think needs
to be fixed. If it's actually a problem, then I'd really, really like
a test case that reproduces it. If it was just a misinterpreted test
case or mis-remembered fact, then that's good news for me. ;-)

Regarding updating the access time in the directory entry, in my
previous reply I explained that NTFS should update it with a one-hour
granularity. With FAT, it's daily.

Regarding the view that this is only about "accurate timings when
files might be being modified while your program is running", in my
previous messages I stressed that the directory entry for a hard link
may have the wrong size, change time, write time, and access time if
it wasn't the last link used to update the file. That has nothing to
do with the file being modified while the program is running. It's a
stale directory entry. If you call os.stat() on the stale link, NTFS
will update it with the correct metadata.
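
A minimal sketch of the practical advice in this thread: treat `DirEntry.stat()` as a fast answer that may be stale on Windows, and ask about the file itself when accurate size or timestamps matter. The names `FIELDS`, `stale_entries` and `demo` are made up for illustration, and the set of fields compared is an assumption about what a caller cares about.

```python
import os
import tempfile

# Fields that, per the discussion above, may be out of date in a directory entry.
FIELDS = ("st_size", "st_mtime", "st_atime", "st_ctime")

def stale_entries(directory):
    """Map entry name -> fields where DirEntry.stat() and os.lstat() disagree."""
    stale = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            cached = entry.stat(follow_symlinks=False)  # on Windows, from the directory entry
            fresh = os.lstat(entry.path)                 # queries the file itself
            diff = [f for f in FIELDS if getattr(cached, f) != getattr(fresh, f)]
            if diff:
                stale[entry.name] = diff
    return stale

def demo():
    """Create one file in a scratch directory and report what was scanned."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "Test"), "w") as f:
            f.write("Anything\n")
        return sorted(os.listdir(tmp)), stale_entries(tmp)
```

On POSIX systems both calls query the file, so the result is normally empty; any differences would be expected only where directory-entry metadata is served from a separate copy.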
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SUGIZ6OAXOD37USVBWAW7JRSUDBSMG7Q/


[Python-Dev] Re: Ideas for improving the contribution experience

2020-10-19 Thread Tal Einat
On Mon, Oct 19, 2020 at 9:24 PM Brett Cannon  wrote:

> I think the other way to help is to really lean into automation so
> reviewing is even lighterweight than it is now. Now this can be as simple
> as to remind people when they need to regenerate a file like 'configure'
> via a status check, simply telling people when their PR failed a status
> check, or go as far as try to automate PEP 7/8 stuff.
>
> Or another way to put it is to try and use automation to help make PRs
> more self-service so that there's less required from a human being to do.
>

A huge +1 from me!

Besides efficiency, I like that kind of automation because it makes our
checking of these things more consistent and official. Another reason is
that it allows contributors to be more independent in their work, rather
than waiting for someone else to explain what they have to do.

I've avoided mentioning code style until now, since I recall it being a
bit of a touchy subject when it came up a few years ago.
Personally I think that the benefits of being able to automate styling and
style checking far outweigh the negatives that were brought up at the time.
Brett, if you'd like to open that can of worms, I'd be willing to help.

 - Tal
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BWW76MVDUDC6MDZKU4HSJ4NFOPR3JY23/


[Python-Dev] Re: Ideas for improving the contribution experience

2020-10-19 Thread Brett Cannon
On Mon., Oct. 19, 2020, 12:33 Tal Einat,  wrote:

> On Mon, Oct 19, 2020 at 9:24 PM Brett Cannon  wrote:
>
>> I think the other way to help is to really lean into automation so
>> reviewing is even lighterweight than it is now. Now this can be as simple
>> as to remind people when they need to regenerate a file like 'configure'
>> via a status check, simply telling people when their PR failed a status
>> check, or go as far as try to automate PEP 7/8 stuff.
>>
>> Or another way to put it is to try and use automation to help make PRs
>> more self-service so that there's less required from a human being to do.
>>
>
> A huge +1 from me!
>
> Besides efficiency, I like that kind of automation because it makes our
> checking of these things more consistent and official. Another reason is
> that it allows contributors to be more independent in their work, rather
> than waiting for someone else to explain what they have to do.
>
> I've avoided mentioning code style until now since I remembered it being a
> bit of a touchy subject a few years ago when I recall it coming up.
> Personally I think that the benefits of being able to automate styling and
> style checking far outweigh the negatives that were brought up at the time.
> Brett, if you'd like to open that can of worms, I'd be willing to help.
>


I have too many cans open right now to pop open another one. 😉

-Brett


>  - Tal
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SYREVH6WV7LA5JMS5XOA2KRHXF75AUHK/


[Python-Dev] Re: PEP 11: Drop support for AIX releases without dlopen

2020-10-19 Thread Kevin Adler
Brett Cannon wrote:
> AIX is not officially supported. We have tried to be helpful and add/remove
> things over the years related to AIX (we used to have an external
> contributor who actively tried to keep AIX supported), but we don't
> guarantee things work since there is no core dev available to try and keep
> AIX running.

Ok, so given that AIX is not officially supported, should PEP 11 be updated? The
change I made to drop dynload_aix may not be worth documenting there, but perhaps
dropping support for AIX 5.3 and below (as stated in
https://bugs.python.org/issue40680) would be.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7AMVCHUFDKHNP7SNCTUW6KLWZRFIJC6F/


[Python-Dev] Re: PEP 11: Drop support for AIX releases without dlopen

2020-10-19 Thread Brett Cannon
On Mon, Oct 19, 2020 at 2:06 PM Kevin Adler 
wrote:

> Brett Cannon wrote:
> > > AIX is not officially supported. We have tried to be helpful and
> add/remove
> > things over the years related to AIX (we used to have an external
> > contributor who actively tried to keep AIX supported), but we don't
> > guarantee things work since there is no core dev available to try and
> keep
> > AIX running.
>
> Ok, so given that AIX is not officially supported, should PEP 11 be
> updated?


Updated how? AIX is not mentioned in that PEP anywhere, so I'm not quite
sure what update you're suggesting.


> The
> change I made to drop dynload_aix may not be worth documenting there, but
> perhaps
> dropping support AIX 5.3 and below (as stated in
> https://bugs.python.org/issue40680)
> would be.
>

Possibly if IBM isn't supporting that version anymore (and based on your
email I would assume you would know 😉).
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZDHU3HHNAUXURRAEGBNHXFQ4YDUHA5US/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Eryk Sun
On 10/19/20, Steve Dower  wrote:
>
> Resolving the path is the most expensive part, even if the file is not
> opened (I've been working with the NTFS team on this area, and we've
> been benchmarking/analysing all of it).

If you say it's been extensively benchmarked and there's no direct way
around the speed bottleneck, then I take your word for it. To clarify
what I had in mind, I was hoping that because NTFS implements the fast
I/O function FastIoQueryOpen [1] (via NtfsNetworkOpenCreate, as given
by its FastIoDispatch table) that IRP_MJ_CREATE would be bypassed and
that the filesystem would not incur a significant cost to parse the
remaining path. I figured that most of the work would be in the
ObOpenObjectByName and IopParseDevice executive calls that lead up
to querying the filesystem.

Anyway, it's unfortunate that the Windows API doesn't support NT
handle-relative names, except in the registry API. If we could call
NTAPI NtQueryAttributesFile [2] directly, then the ObjectAttributes
argument could be relative to a directory handle set in the
RootDirectory field. That would eliminate the vast majority of the
path-resolution cost. A handle-relative open or query goes straight to
the filesystem device, which goes straight to the directory that
contains the file.
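
To illustrate what a handle-relative lookup buys, here is a sketch using the closest thing Python already exposes: the POSIX `dir_fd` parameter. This is an analogy, not the NT API described above. The names `stat_children` and `demo` are hypothetical, and platforms that do not list `os.stat` in `os.supports_dir_fd` (expected to include Windows) take the plain path-based fallback.

```python
import os
import tempfile

def stat_children(directory):
    """Stat each entry relative to an open directory descriptor when supported."""
    names = os.listdir(directory)
    if os.stat in os.supports_dir_fd:
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            # Each lookup starts at the already-open directory, so the
            # leading path components are not resolved again per file.
            return {name: os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
                    for name in names}
        finally:
            os.close(dir_fd)
    # Fallback: resolve the full path for every entry.
    return {name: os.lstat(os.path.join(directory, name)) for name in names}

def demo():
    """Stat one five-byte file in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "a.txt"), "w") as f:
            f.write("hello")
        return {name: st.st_size for name, st in stat_children(tmp).items()}
```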

To eliminate the cost of opening the directory handle, scandir() could
be rewritten to use CreateFileW and GetFileInformationByHandleEx:
FileIdBothDirectoryInfo [3] instead of FindFirstFileW / FindNextFileW.
Just cache the directory handle in place of caching the find handle.
scandir() would gain fd support in Windows. Opening a directory via
os.open requires the flag _O_OBTAIN_DIR (0x2000), defined in fcntl.h.

FileIdBothDirectoryInfo provides the file ID, so the implementation
would support the inode() method without calling stat(). It would
still directly support is_dir() and is_file() based on the file
attributes, and is_symlink() based on the file attributes and the
EaSize field. The Windows Protocols document that the latter contains
the reparse tag for a reparse point. The field is reused because a
reparse point can't have extended attributes.
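
For context, this is how the `DirEntry` methods named above are typically consumed. It is a sketch against the existing `os.scandir()` API, not the proposed reimplementation; `classify` and `demo` are hypothetical helper names.

```python
import os
import tempfile

def classify(directory):
    """Map entry name -> (kind, inode) using only DirEntry methods."""
    kinds = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_symlink():
                kind = "symlink"
            elif entry.is_dir(follow_symlinks=False):
                kind = "dir"
            elif entry.is_file(follow_symlinks=False):
                kind = "file"
            else:
                kind = "other"
            # With the current implementation, the first inode() call needs
            # a system call on Windows; the proposal above would avoid it.
            kinds[entry.name] = (kind, entry.inode())
    return kinds

def demo():
    """Classify one subdirectory and one file in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "sub"))
        with open(os.path.join(tmp, "a.txt"), "w") as f:
            f.write("x")
        return {name: kind for name, (kind, _inode) in classify(tmp).items()}
```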

All that said, I'd prefer not to call NtQueryAttributesFile or any
other NTAPI function in Windows Python. I'd rather do the best
possible with just the Windows API. I wish there were a new
GetFileAttributesExExW function that supported handle-relative names.
Even better would be a new function that calls
NtQueryInformationByName -- something like GetFileInformationByName --
for FileStatInfo (and FileCaseSensitiveInfo as well, which is becoming
more of an issue), also with support for handle-relative names.

[1] 
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/ns-wdm-_fast_io_dispatch
[2] 
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-zwqueryfullattributesfile
[3] 
https://docs.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_id_both_dir_info
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GODUIB5WKVZLX4BVPEM2NS37JFHUXIID/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Greg Ewing

On 20/10/20 4:52 am, Gregory P. Smith wrote:
> Those of us with a traditional posix filesystem background may raise
> eyeballs at this duplication, seeing a directory as a place that merely
> maps names to inodes


This is probably a holdover from MS-DOS, where there was no separate
inode-like structure -- it was all in the directory entry.

--
Greg
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QJVZ2EXFKCMZ4YHERFI2FXJTWWPFCFSA/


[Python-Dev] PEP 640: Unused variable syntax.

2020-10-19 Thread Thomas Wouters
One of the problems I have with the Pattern Matching proposal (PEP 622
originally, now PEPs 634, 635, 636) is the special-casing of '_' to not
actually assign to the name, which is a subtle but meaningful divergence
from the rest of Python. In discussions with the authors I proposed using
'?' instead *and* extending that to existing unpacking syntax. The PEP
authors were understandably a little hesitant to tack that onto their
already quite extensive proposal, so I suggested making it a
separate proposal. That proposal is at
https://www.python.org/dev/peps/pep-0640 (unless that hasn't updated yet,
in which case it's at
https://github.com/python/peps/blob/master/pep-0640.rst), and also included
below.

This proposal doesn't necessarily require pattern matching to be accepted
-- the new syntax stands well enough on its own -- but I'm recommending
this *not* be accepted if pattern matching using the same syntax is not
also accepted. The benefit without pattern matching is real but small, and
in my opinion it's not worth the added complexity.

The PEP:

PEP: 640
Title: Unused variable syntax
Author: Thomas Wouters 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 04-Oct-2020
Python-Version: 3.10
Post-History:

Abstract
========

This PEP proposes new syntax for *unused variables*, providing a pseudo-name
that can be assigned to but not otherwise used. The assignment doesn't
actually happen, and the value is discarded instead.

Motivation
==========

In Python it is somewhat common to need to do an assignment without actually
needing the result. Conventionally, people use either ``"_"`` or a name such
as ``"unused"`` (or with ``"unused"`` as a prefix) for this. It's most
common in *unpacking assignments*::

   x, unused, z = range(3)
   x, *unused, z = range(10)

It's also used in ``for`` loops and comprehensions::

   for unused in range(10): ...
   [ SpamObject() for unused in range(10) ]

The use of ``"_"`` in these cases is probably the most common, but it
potentially conflicts with the use of ``"_"`` in internationalization, where
a call like gettext.gettext() is bound to ``"_"`` and used to mark strings
for translation.

In the proposal to add Pattern Matching to Python (originally PEP 622, now
split into PEP 634, PEP 635 and PEP 636), ``"_"`` has an *additional*
special meaning. It is a wildcard pattern, used in places where variables
could be assigned to, to indicate anything should be matched but not
assigned to anything. The choice of ``"_"`` there matches the use of ``"_"``
in other languages, but the semantic difference with ``"_"`` elsewhere in
Python is significant.

This PEP proposes to allow a special token, ``?``, to be used instead. This
has most of the benefits of ``"_"`` without affecting other uses of that
otherwise regular variable. Allowing the use of the same wildcard pattern
would make pattern matching and unpacking assignment more consistent with
each other.

Rationale
=========

Marking certain variables as unused is a useful tool, as it helps clarity of
purpose of the code. It makes it obvious to readers of the code as well as
automated linters, that a particular variable is *intentionally* unused.

However, despite the convention, ``"_"`` is not a special variable. The
value is still assigned to, the object it refers to is still kept alive
until the end of the scope, and it can still be used. Nor is the use of
``"_"`` for unused variables entirely ubiquitous, since it conflicts with
conventional internationalization, it isn't obvious that it is a regular
variable, and it isn't as obviously unused as a variable named
``"unused"``.
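
A small sketch of the point made above, that ``"_"`` is an ordinary name today: using it as the throwaway slot in an unpacking assignment rebinds whatever ``"_"`` referred to before, including a gettext alias. The function name `demo` is made up for illustration.

```python
import gettext

def demo():
    _ = gettext.gettext      # conventional alias for marking translatable strings
    before = _("hello")      # no translation catalog installed, so the text comes back unchanged
    x, _, z = range(3)       # the "unused" slot is a real assignment to _
    try:
        _("hello")           # _ is now the integer 1, which is not callable
        clobbered = False
    except TypeError:
        clobbered = True
    return before, _, clobbered
```

Here `demo()` returns the untranslated string, the value that ended up bound to ``_``, and a flag showing that the alias was overwritten.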

In the Pattern Matching proposal, the use of ``"_"`` for wildcard patterns
side-steps the problems of ``"_"`` for unused variables by virtue of it
being in a separate scope. The only conflict it has with
internationalization is one of potential confusion; it will not actually
interact with uses of a global variable called ``"_"``. However, the
special-casing of ``"_"`` for this wildcard pattern purpose is still
problematic: the different semantics *and meaning* of ``"_"`` inside pattern
matching and outside of it means a break in consistency in Python.

Introducing ``?`` as special syntax for unused variables *both inside and
outside pattern matching* allows us to retain that consistency. It avoids
the conflict with internationalization *or any other uses of _ as a
variable*. It makes unpacking assignment align more closely with pattern
matching, making it easier to explain pattern matching as an extension of
unpacking assignment.

In terms of code readability, using a special token makes it easier to find
out what it means (``"what does question mark in Python do"`` versus ``"why
is my _ variable not getting assigned to"``), and makes it more obvious that
the actual intent is for the value to be unused -- since it is entirely
impossible to use it.

Specification
=============

A new token is introduced, ``"?"``, or ``token.QMARK``.

[Python-Dev] Re: PEP 11: Drop support for AIX releases without dlopen

2020-10-19 Thread Kevin Adler
Brett Cannon wrote:
> Updated how? AIX is not mentioned in that PEP anywhere, so I'm not quite
> sure what update you're suggesting.

I'm referring to
https://www.python.org/dev/peps/pep-0011/#unsupporting-platforms.

"If a certain platform that currently has special code in CPython is deemed to
be without enough Python users or lacks proper support from the Python
development team and/or a buildbot, a note must be posted in this PEP that this
platform is no longer actively supported."

https://www.python.org/dev/peps/pep-0011/#no-longer-supported-platforms

Should this list be updated to mention that AIX 5.3 and below are no longer 
supported?
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DPEE7GDHLL5ZUY6UZTLIRPGNLB4FWKLZ/


[Python-Dev] Re: PEP 638: Syntactic macros

2020-10-19 Thread Dima Tisnek
Martin sent good links, I'll just add a practical example:


Without macros, styled-components works like this:

import styled from "styled-components";

const Label = styled.div`
  color: red;
`;

(that's a js template literal, next level after f-strings, here
without any arguments)
When e.g. <Label>Hi</Label> is rendered, the result is a <div> containing Hi,
where the class name is a hash of the inline css,
and css is injected into the document head.

This is great for many reasons, except that the developer often ends up with
the below on the page (dev tools, though page source is possible too)
where it's quite hard to mentally work back which styled.div it was
that generated hash 7676...

  

  ...


Here's the same with babel macros:

import styled from "styled-components/macro";

const Label = styled.div`
  color: red
`;

When <Label>Hi</Label> is rendered, the result is a <div> containing Hi, where
the class name encodes a hash of the inline css, but also the file name where
it was "defined" as well as the name of the variable to which it was assigned,
changing the div tree e.g. to:

  

  ...


How it works:
* babel macros work on AST, not text
* babel macro has access to entire module AST, and can thus infer and
modify the module:
  * here, module name is recorded, and if the result of macro call is
assigned, then target variable name is recorded
  * I've seen automatic import injection (useful to pass global state
in e.g. localisation libraries)
  * almost anything is possible
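
As a rough Python analogy to the bullets above (not babel's API, and not the mechanism PEP 638 proposes), the standard `ast` module is enough to show how a transform that sees the whole module can recover the variable a call result is assigned to. `SOURCE`, `styled`, `find_macro_targets` and the `module__name` label format are all made up for illustration.

```python
import ast

# Hypothetical module source; "styled" stands in for the macro-like call.
SOURCE = '''
Label = styled("div", "color: red")
Title = styled("h1", "font-weight: bold")
other = unrelated()
'''

def find_macro_targets(source, macro_name="styled", module_name="example"):
    """Return 'module__variable' labels for each `name = macro_name(...)` assignment."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Name)
                and node.value.func.id == macro_name):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    # Record both the module and the assignment target,
                    # the two pieces of context mentioned above.
                    found.append(f"{module_name}__{target.id}")
    return found
```

A text-based preprocessor would have to guess at this context, which is the practical argument for working on the tree.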

There are downsides:
1. the macro code is more involved, e.g. see
https://github.com/styled-components/styled-components/blob/master/packages/styled-components/src/macro/index.js
2. multiple macros can and do collide on occasion (usually when
written naively), which somewhat limits composability by end users

On Sat, 17 Oct 2020 at 07:18, Guido van Rossum  wrote:
>
> Dima,
>
> Do you have a link to "babel macros"? Searching for that brought up several 
> different things; not being a frequent JS user I don't know how to filter 
> these.
>
> --Guido
>
> On Wed, Oct 14, 2020 at 11:55 PM Dima Tisnek  wrote:
>>
>> My 2c as a Python user (mostly) and someone who dabbled in ES2020:
>>
>> The shouting syntax! does not sit well with me.
>> The $hygenic is also cumbersome.
>>
>> To contrast, babel macros:
>> * looks like regular code, without special syntax: existing tooling
>> works, less mental strain
>> * have access to call site environment, so not strictly hygienic(?):
>> allow for greater expressive power
>>
>> I think the two points above really helped adopt babel macros in the
>> js community and should, at the very least be seriously considered by
>> the py community.
>>
>> Cheers,
>> d.
>>
>> On Sat, 26 Sep 2020 at 21:16, Mark Shannon  wrote:
>> >
>> > Hi everyone,
>> >
>> > I've submitted my PEP on syntactic macros as PEP 638.
>> > https://www.python.org/dev/peps/pep-0638/
>> >
>> > All comments and suggestions are welcome.
>> >
>> > Cheers,
>> > Mark
>> > ___
>> > Python-Dev mailing list -- python-dev@python.org
>> > To unsubscribe send an email to python-dev-le...@python.org
>> > https://mail.python.org/mailman3/lists/python-dev.python.org/
>> > Message archived at 
>> > https://mail.python.org/archives/list/python-dev@python.org/message/U4C4XHNRC4SHS3TPZWCTY4SN4QU3TT6V/
>> > Code of Conduct: http://python.org/psf/codeofconduct/
>> ___
>> Python-Dev mailing list -- python-dev@python.org
>> To unsubscribe send an email to python-dev-le...@python.org
>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>> Message archived at 
>> https://mail.python.org/archives/list/python-dev@python.org/message/VEC7VWY5TJJGBXWFQUX3XO43SQAZ7FMR/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> Pronouns: he/him (why is my pronoun here?)
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/56CWO3OM52CM6ANOOIPFXWQVGL75C4JK/


[Python-Dev] Re: os.scandir bug in Windows?

2020-10-19 Thread Rob Cliffe via Python-Dev



On 19/10/2020 12:42, Steve Dower wrote:

On 15Oct2020 2239, Rob Cliffe via Python-Dev wrote:
TLDR: In os.scandir directory entries, atime is always a copy of 
mtime rather than the actual access time.


Correction - os.stat() updates the access time to _now_, while 
os.scandir() returns the last access time without updating it.


Eryk replied with a deeper explanation of the cause, but fundamentally 
this is what you are seeing.


Feel free to file a bug, but we'll likely only add a vague note to the 
docs about how Windows works here rather than changing anything. If 
anything, we should probably fix os.stat() to avoid updating the 
access time so that both functions behave the same, but that might be 
too complicated.


Cheers,
Steve

Sorry - what you say does not match the behaviour I observe, which is that
    (1) Neither os.stat, nor reading os.scandir directory entries, updates
        any of the times on disk.
    (2) os.stat's st_atime returns the "correct" time the file was last
        accessed.
    (3) os.scandir always returns st_atime equal to st_mtime.

Modified demo program:

# osscandirtest.py
import time, os

print(f'[1] {time.time()=}')
with open('Test', 'w') as f: f.write('Anything\n')

time.sleep(20)

print(f'[2] {time.time()=}')
with open('Test', 'r') as f: f.readline() # Read the file

time.sleep(10)

print(f'[3] {time.time()=}')
print(os.stat('Test'))
for DirEntry in os.scandir('.'):
    if DirEntry.name == 'Test':
        stat = DirEntry.stat()
        print(f'scandir DirEntry {stat.st_ctime=} {stat.st_mtime=} {stat.st_atime=}')

print(os.stat('Test'))
for DirEntry in os.scandir('.'):
    if DirEntry.name == 'Test':
        stat = DirEntry.stat()
        print(f'scandir DirEntry {stat.st_ctime=} {stat.st_mtime=} {stat.st_atime=}')

print(f'[4] {time.time()=}')

Sample output:

[1] time.time()=1603166161.12121
[2] time.time()=1603166181.1306772
[3] time.time()=1603166191.1426473
os.stat_result(st_mode=33206, st_ino=9851624184951253, st_dev=2230120362, st_nlink=1, st_uid=0, st_gid=0, st_size=10, st_atime=1603166181, st_mtime=1603166161, st_ctime=1603166161)
scandir DirEntry stat.st_ctime=1603166161.12121 stat.st_mtime=1603166161.12121 stat.st_atime=1603166161.12121
os.stat_result(st_mode=33206, st_ino=9851624184951253, st_dev=2230120362, st_nlink=1, st_uid=0, st_gid=0, st_size=10, st_atime=1603166181, st_mtime=1603166161, st_ctime=1603166161)
scandir DirEntry stat.st_ctime=1603166161.12121 stat.st_mtime=1603166161.12121 stat.st_atime=1603166161.12121
[4] time.time()=1603166191.1426473

You will observe that
    (1) The results from the two os.stat calls are the same, as are the
        results from the two scandir calls.
    (2) The os.stat st_atime (1603166181) *IS* the time that the file was
        read with the
            with open('Test', 'r') as f: f.readline() # Read the file
        line of code, as it matches the
            [2] time.time()=1603166181.1306772
        line of output (apart from discarded fractions of a second) and is
        20 seconds (*not* 30 seconds) after the file creation time, as expected.
    (3) The os.scandir atime is a copy of mtime (and in this case, of
        ctime as well).


So it really does seem that the only thing "wrong" is that os.scandir 
returns atime as a copy of mtime, rather than the correct value.
And since os.stat returns the "right" answer and os.scandir doesn't, it 
really seems that this is a bug, or at least a deficiency, in os.scandir.


Demo run on Windows 10 Home version 1903 OS build 18362.1139
Python version 3.8.3 (32-bit).
Best wishes
Rob Cliffe
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MGICSKCSTSKS36XUP6IZTXZOSGBPMQYY/