[issue38894] Path.glob() sometimes misses files that match

2020-10-29 Thread daniel hahler


Change by daniel hahler :


--
nosy: +blueyed
nosy_count: 4.0 -> 5.0
pull_requests: +21943
pull_request: https://github.com/python/cpython/pull/23025

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38894] Path.glob() sometimes misses files that match

2020-03-07 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue38894] Path.glob() sometimes misses files that match

2020-03-07 Thread miss-islington


miss-islington  added the comment:


New changeset 928b4dd0edf0022190a8a296c8ea65e7ef55c694 by Miss Islington (bot) 
in branch '3.8':
bpo-38894: Fix pathlib.Path.glob in the presence of symlinks and insufficient 
permissions (GH-18815)
https://github.com/python/cpython/commit/928b4dd0edf0022190a8a296c8ea65e7ef55c694


--




[issue38894] Path.glob() sometimes misses files that match

2020-03-07 Thread miss-islington


miss-islington  added the comment:


New changeset cca0b31fb8ed7d25ede68f314d4a85bb07d6ca6f by Miss Islington (bot) 
in branch '3.7':
bpo-38894: Fix pathlib.Path.glob in the presence of symlinks and insufficient 
permissions (GH-18815)
https://github.com/python/cpython/commit/cca0b31fb8ed7d25ede68f314d4a85bb07d6ca6f


--




[issue38894] Path.glob() sometimes misses files that match

2020-03-07 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:


New changeset eb7560a73d46800e4ade4a8869139b48e6c92811 by Pablo Galindo in 
branch 'master':
bpo-38894: Fix pathlib.Path.glob in the presence of symlinks and insufficient 
permissions (GH-18815)
https://github.com/python/cpython/commit/eb7560a73d46800e4ade4a8869139b48e6c92811


--




[issue38894] Path.glob() sometimes misses files that match

2020-03-07 Thread miss-islington


Change by miss-islington :


--
pull_requests: +18188
pull_request: https://github.com/python/cpython/pull/18831




[issue38894] Path.glob() sometimes misses files that match

2020-03-07 Thread miss-islington


Change by miss-islington :


--
nosy: +miss-islington
nosy_count: 3.0 -> 4.0
pull_requests: +18187
pull_request: https://github.com/python/cpython/pull/18830




[issue38894] Path.glob() sometimes misses files that match

2020-03-06 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
keywords: +patch
pull_requests: +18173
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/18815




[issue38894] Path.glob() sometimes misses files that match

2020-03-06 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Ok, I managed to reproduce.

This seems to be a regression introduced by
https://github.com/python/cpython/pull/11988 for issue
https://bugs.python.org/issue36035.

--




[issue38894] Path.glob() sometimes misses files that match

2020-03-06 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
Removed message: https://bugs.python.org/msg363548




[issue38894] Path.glob() sometimes misses files that match

2020-03-06 Thread Pablo Galindo Salgado

Pablo Galindo Salgado  added the comment:

I cannot reproduce this last example:

~/github/python/master master*
❯ mkdir tmp

~/github/python/master master*
❯ cd tmp 

~/github/python/master/tmp master*
❯ touch 1.txt

~/github/python/master/tmp master*
❯ ln -s subdir/file 2.txt   

~/github/python/master/tmp master*
❯ touch 3.txt

~/github/python/master/tmp master*
❯ python3.8 -c "import pathlib; print(list(pathlib.Path('.').glob('*')))" 

[PosixPath('1.txt'), PosixPath('3.txt'), PosixPath('2.txt')]

~/github/python/master/tmp master*
❯ mkdir subdir

~/github/python/master/tmp master*
❯ python3.8 -c "import pathlib; print(list(pathlib.Path('.').glob('*')))"  
[PosixPath('1.txt'), PosixPath('subdir'), PosixPath('3.txt'), 
PosixPath('2.txt')]

~/github/python/master/tmp master*
❯ chmod 000 subdir

~/github/python/master/tmp master*
❯ python3.8 -c "import pathlib; print(list(pathlib.Path('.').glob('*')))"
[PosixPath('1.txt'), PosixPath('subdir'), PosixPath('3.txt')]

--
keywords: +3.7regression, 3.8regression
versions: +Python 3.7, Python 3.9




[issue38894] Path.glob() sometimes misses files that match

2020-03-06 Thread Matt Wozniski


Matt Wozniski  added the comment:

A simple test case for this issue:

~>mkdir tmp
~>cd tmp
tmp>touch 1.txt
tmp>ln -s subdir/file 2.txt
tmp>touch 3.txt
tmp>ls -l
total 0
-rw-rw-r-- 1 mwoznisk general  0 Mar  6 14:52 1.txt
lrwxrwxrwx 1 mwoznisk general 11 Mar  6 14:52 2.txt -> subdir/file
-rw-rw-r-- 1 mwoznisk general  0 Mar  6 14:52 3.txt
tmp>python3.8 -c "import pathlib; print(list(pathlib.Path('.').glob('*')))"
[PosixPath('1.txt'), PosixPath('2.txt'), PosixPath('3.txt')]
tmp>mkdir subdir
tmp>python3.8 -c "import pathlib; print(list(pathlib.Path('.').glob('*')))"
[PosixPath('1.txt'), PosixPath('2.txt'), PosixPath('3.txt'), 
PosixPath('subdir')]

So far so good, but if the subdirectory isn't readable, things fall apart:

tmp>chmod 000 subdir
tmp>python3.8 -c "import pathlib; print(list(pathlib.Path('.').glob('*')))"
[PosixPath('1.txt')]
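
The shell session above can also be scripted. The following is a minimal sketch, not part of the original report: it assumes a POSIX system and a non-root user (root bypasses the mode-000 check), and the helper name compare_globs is made up for illustration. It builds the same layout in a temporary directory and lists it with both glob.glob() and Path.glob().

```python
import glob
import os
import tempfile
from pathlib import Path


def compare_globs():
    """Recreate the layout from the transcript and list it two ways."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("1.txt", "3.txt"):
            (root / name).touch()
        subdir = root / "subdir"
        subdir.mkdir()
        # Same link as in the transcript: 2.txt -> subdir/file
        (root / "2.txt").symlink_to(os.path.join("subdir", "file"))
        os.chmod(subdir, 0o000)
        try:
            via_glob = sorted(
                os.path.basename(p) for p in glob.glob(os.path.join(tmp, "*"))
            )
            via_pathlib = sorted(p.name for p in root.glob("*"))
        finally:
            # Restore permissions so the temporary directory can be removed.
            os.chmod(subdir, 0o700)
    return via_glob, via_pathlib


via_glob, via_pathlib = compare_globs()
print("glob.glob :", via_glob)
print("Path.glob :", via_pathlib)
```

On an affected interpreter the second list is expected to be shorter than the first; on a fixed one the two should match.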

Looks like this is caused by entry.is_dir() in pathlib._WildcardSelector
raising a PermissionError when trying to check whether a symlink pointing into
an unreadable directory is a directory.  EACCES isn't in IGNORED_ERROS
(sic), so the loop over directory entries is broken out of, and the "except
PermissionError:" block in _select_from swallows the exception, so the
failure is silent.
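
The control flow described above can be sketched without touching the filesystem. This is an illustration only, not the pathlib source: FakeEntry, select_aborting and select_tolerant are hypothetical names, and the second function is just one possible way to keep scanning, not necessarily what the eventual patch does.

```python
class FakeEntry:
    """Hypothetical stand-in for os.DirEntry; not part of pathlib."""

    def __init__(self, name, denied=False):
        self.name = name
        self._denied = denied

    def is_dir(self):
        if self._denied:
            # Mimics a symlink whose target lies in an unreadable directory.
            raise PermissionError(13, "Permission denied", self.name)
        return False


def select_aborting(entries):
    """One try/except around the whole loop: the first error ends the scan."""
    found = []
    try:
        for entry in entries:
            entry.is_dir()  # may raise PermissionError
            found.append(entry.name)
    except PermissionError:
        pass  # swallowed silently; the remaining entries are never visited
    return found


def select_tolerant(entries):
    """Handle the error per entry so the scan continues."""
    found = []
    for entry in entries:
        try:
            entry.is_dir()
        except PermissionError:
            pass  # the type check failed, but the entry itself still exists
        found.append(entry.name)
    return found


entries = [FakeEntry("1.txt"), FakeEntry("2.txt", denied=True), FakeEntry("3.txt")]
print(select_aborting(entries))  # ['1.txt']
print(select_tolerant(entries))  # ['1.txt', '2.txt', '3.txt']
```

Because the aborting variant stops at the first failing entry, how many names it returns depends on where that entry falls in the directory listing.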

--
nosy: +Matt Wozniski
versions: +Python 3.8 -Python 3.7




[issue38894] Path.glob() sometimes misses files that match

2020-03-06 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
assignee:  -> pablogsal
nosy: +pablogsal




[issue38894] Path.glob() sometimes misses files that match

2019-11-22 Thread Thierry Parmentelat


Thierry Parmentelat  added the comment:

To clarify, when I said 'lambda user' I meant a regular, non-root user that has
no permission to read /root.

--




[issue38894] Path.glob() sometimes misses files that match

2019-11-22 Thread Thierry Parmentelat


New submission from Thierry Parmentelat :

I have observed this on a Linux box running Fedora 29.

$ python3 --version
Python 3.7.5
$ uname -a
Linux faraday.inria.fr 5.3.11-100.fc29.x86_64 #1 SMP Tue Nov 12 20:41:25 UTC 
2019 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/fedora-release
Fedora release 29 (Twenty Nine)

 steps to reproduce:

This assumes that /root is not readable by lambda users

- as root:

# mkdir /tmp/foo
# cd /tmp/foo
# touch a b d e
# ln -s /root/anywhere c

# ls -l
total 0
-rw-r--r-- 1 root root  0 Nov 22 14:51 a
-rw-r--r-- 1 root root  0 Nov 22 14:51 b
lrwxrwxrwx 1 root root 14 Nov 22 14:53 c -> /root/anywhere
-rw-r--r-- 1 root root  0 Nov 22 14:51 d
-rw-r--r-- 1 root root  0 Nov 22 14:51 e


- as a lambda user:

we can see all files

$ ls -l /tmp/foo
total 0
-rw-r--r-- 1 root root  0 Nov 22 14:51 a
-rw-r--r-- 1 root root  0 Nov 22 14:51 b
lrwxrwxrwx 1 root root 14 Nov 22 14:53 c -> /root/anywhere
-rw-r--r-- 1 root root  0 Nov 22 14:51 d
-rw-r--r-- 1 root root  0 Nov 22 14:51 e

and with glob.glob() too

In [1]: import glob

In [2]: for filename in glob.glob("/tmp/foo/*"):
   ...:     print(filename)
   ...:
/tmp/foo/c
/tmp/foo/e
/tmp/foo/d
/tmp/foo/b
/tmp/foo/a


BUT Path.glob() is not working as expected

In [3]: from pathlib import Path

In [4]: for filename in Path("/tmp/foo/").glob("*"):
   ...:     print(filename)
   ...:



- If I now go back as root and remove the problematic file in /tmp/foo

# rm /tmp/foo/c


- and try again as a lambda user

In [5]: for filename in Path("/tmp/foo/").glob("*"):
   ...:     print(filename)
   ...:
/tmp/foo/e
/tmp/foo/d
/tmp/foo/b
/tmp/foo/a


 discussion

In my case, in a real application, I was getting *some* files, not an empty
list like here.

I ran strace on that real application.
It's fairly clear from that output that the odd symlink causes the scan of the
directory to stop instead of continuing (see the extract below).
Of course, the order in which files are read from the disk affects the
behaviour; that's why I created the symlink last. That might need to be changed
to reproduce successfully in another setup.



 strace extract


getdents64(3, /* 189 entries */, 32768) = 8640
getdents64(3, /* 0 entries */, 32768)   = 0
close(3)                                = 0
stat("/var/lib/rhubarbe-images/centos.ndz", {st_mode=S_IFREG|0644, 
st_size=1002438656, ...}) = 0
stat("/var/lib/rhubarbe-images/oai-enb.ndz", {st_mode=S_IFREG|0644, 
st_size=2840592384, ...}) = 0

stat("/var/lib/rhubarbe-images/ubuntu-floodlight.ndz", {st_mode=S_IFREG|0644, 
st_size=2559574016, ...}) = 0
stat("/var/lib/rhubarbe-images/ndnsim.ndz", {st_mode=S_IFREG|0644, 
st_size=4153409536, ...}) = 0

==> that's the line about the broken symlink in my real app
stat("/var/lib/rhubarbe-images/push-to-preplab.sh", 0x7ffd3ac4a140) = -1 EACCES 
(Permission denied)
==> and here it stops scanning files while there are still quite a lot to be 
dealt with

write(1, "/var/lib/rhubarbe-images/fedora-"..., 
82/var/lib/rhubarbe-images/fedora-31.ndz
/var/lib/rhubarbe-images/fedora-31-ssh.ndz
) = 82
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, 
sa_restorer=0x7fc583705e70}, {sa_handler=0x7fc583936f10, sa_mask=[], \
sa_flags=SA_RESTORER, sa_restorer=0x7fc583705e70}, 8) = 0
sigaltstack(NULL, {ss_sp=0x560a7dac3330, ss_flags=0, ss_size=16384}) = 0
sigaltstack({ss_sp=NULL, ss_flags=SS_DISABLE, ss_size=0}, NULL) = 0
exit_group(0)                           = ?
+++ exited with 0 +++
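
The failing stat() call in the extract follows the symlink, which is why it needs search permission on the target's directory. The sketch below, which is not part of the original report, contrasts it with lstat(), which inspects the link itself. It assumes a POSIX system; the helper name probe_symlink and the directory name "locked" are made up for illustration. As a regular user the reported errno should be EACCES; as root, which bypasses the permission check, it should be ENOENT because the target does not exist.

```python
import errno
import os
import tempfile


def probe_symlink():
    """Return (lstat_ok, stat_errno) for a link pointing into a mode-000 directory."""
    with tempfile.TemporaryDirectory() as tmp:
        locked = os.path.join(tmp, "locked")
        os.mkdir(locked)
        link = os.path.join(tmp, "link")
        os.symlink(os.path.join(locked, "missing"), link)
        os.chmod(locked, 0o000)
        try:
            os.lstat(link)  # inspects the link itself, never enters "locked"
            lstat_ok = True
            try:
                os.stat(link)  # follows the link, so it must search "locked"
                stat_errno = None
            except OSError as exc:
                stat_errno = exc.errno
        finally:
            os.chmod(locked, 0o700)  # restore so cleanup can succeed
    return lstat_ok, stat_errno


lstat_ok, stat_errno = probe_symlink()
print(lstat_ok, errno.errorcode.get(stat_errno))
```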

--
messages: 357284
nosy: thierry.parmentelat
priority: normal
severity: normal
status: open
title: Path.glob() sometimes misses files that match
type: behavior
versions: Python 3.7
