[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-26 Thread Giampaolo Rodola'


Change by Giampaolo Rodola' :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-26 Thread Giampaolo Rodola'


Giampaolo Rodola'  added the comment:


New changeset 65c92c5870944b972a879031abd4c20c4f0d7981 by Giampaolo Rodola 
(Bruno P. Kinoshita) in branch '3.8':
[3.8] bpo-38688, shutil.copytree: consume iterator and create list of entries 
to prevent infinite recursion (GH-17397)
https://github.com/python/cpython/commit/65c92c5870944b972a879031abd4c20c4f0d7981


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-26 Thread Bruno P. Kinoshita


Change by Bruno P. Kinoshita :


--
pull_requests: +16878
stage: commit review -> patch review
pull_request: https://github.com/python/cpython/pull/17397

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-26 Thread Giampaolo Rodola'


Change by Giampaolo Rodola' :


--
stage: patch review -> commit review
versions: +Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-26 Thread Giampaolo Rodola'


Giampaolo Rodola'  added the comment:


New changeset 9bbcbc9f6dfe1368fe7330b117707f828e6a2c18 by Giampaolo Rodola 
(Bruno P. Kinoshita) in branch 'master':
bpo-38688, shutil.copytree: consume iterator and create list of entries to 
prevent infinite recursion (GH-17098)
https://github.com/python/cpython/commit/9bbcbc9f6dfe1368fe7330b117707f828e6a2c18


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-21 Thread Bruno P. Kinoshita


Bruno P. Kinoshita  added the comment:

Done. Rebased on master too, and edited commit message & GH PR title. Thanks 
Giampaolo!

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-21 Thread Giampaolo Rodola'


Giampaolo Rodola'  added the comment:

PR-17098 as it stands re-introduces some stat() syscall. I suggest to just 
consume the iterator: it's a small change and it should fix the issue.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-21 Thread Bruno P. Kinoshita


Bruno P. Kinoshita  added the comment:

I really liked that improvement, and didn't think it needed to be removed. 
That's why the PR reverts it partially. I think the os.stat improvements were 
in the other methods changed, and should not be changed in my PR - unless I 
changed it by accident.

So with the current PR for this issue, or with your suggested patch, both 
should have similar performance I think. (I hadn't seen that script to measure 
performance, thanks.)

I am happy if we revert partially, or if we build the fix on top of the current 
code consuming the iterator. Should I update the PR with your suggested fix 
then?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-21 Thread Giampaolo Rodola'


Giampaolo Rodola'  added the comment:

The speedup introduced in issue33695 is mostly because the number of os.stat() 
syscall was reduced from 6 to 1 per file (both by using scandir() and because 
stat() results are cached and passed around between function calls). As such, 
even if we immediately consume scandir() iterator I believe it won't have a 
meaningful impact in terms of speed. FWIW issue33695 has a benchmark script 
attached (but it's not very stable).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-21 Thread Bruno P. Kinoshita


Bruno P. Kinoshita  added the comment:

Hi Giampaolo,

I think it is more or less the same as the previous code, which was using 
os.list to return a list in memory. My first tentative fix was:

def copytree(src, ...):
entries = os.list(src)
return _copytree(entries=entries, ...)

But the previous PR also altered _copytree to use the return of os.scandir 
DirEntry's, so the change above results in AttributeError: 'str' object has no 
attribute 'name'.

Would be better to avoid using iterator to populate a list, and also using the 
DirEntry in _copytree, and just stick with the previous code with (i.e. 
os.listdir() and a single copytree method as before)? Or if you think we should 
go with your suggestion, I'm good with it as well - especially as it'd be a 
much simpler PR :)

Thanks
Bruno

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-20 Thread Giampaolo Rodola'


Giampaolo Rodola'  added the comment:

PR-17098 basically reverts https://bugs.python.org/issue33695. Not good. =) 
I think we can simply consume the iterator immediately as in:

def copytree(src, ...):
with os.scandir(src) as itr:
entries = list(itr)
return _copytree(entries=entries, ...)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-09 Thread Bruno P. Kinoshita


Bruno P. Kinoshita  added the comment:

Hi,

I did a quick `git bisect` using the example provided, and looks like this 
regression was added in the fix for bpo-33695, commit 
19c46a4c96553b2a8390bf8a0e138f2b23e28ed6.

It looks to me that the iterator returned by with os.scandir(...) is including 
the newly created dst directory (see the call for `os.makedirs(dst, 
exist_ok=dirs_exist_ok)` in 
https://github.com/python/cpython/blob/e27449da92b13730a5e11182f329d5da98a5e05b/Lib/shutil.py#L449).

This results in the function finding an extra directory, and repeating the 
steps for this folder and its subfolder recursively. This only happens because 
in the example in this issue, dst is a subdirectory of src.

The bpo-33695 commit had more changes, so I've reverted just this block of the 
copytree as a tentative fix, and added a unit test: 
https://github.com/python/cpython/pull/17098

--

Here's a simplified version of what's going on:

```
import os
import shutil

shutil.rmtree('/tmp/test/', True)
os.makedirs('/tmp/test')
with open('/tmp/test/foo', 'w+') as f:
  f.write('foo')

# now we have /tmp/test/foo, let's simulate what happens in copytree on master

with os.scandir('/tmp/test') as entries:
  # up to this point, /tmp/test/foo is the only entry
  os.makedirs('/tmp/test/1/2/3/', exist_ok=True) # < when the iterator 
starts below in `f in entries`, it will find 1 too
  # now 1 will have been added too
  for f in entries:
print(f)
```

--
nosy: +kinow

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-09 Thread Bruno P. Kinoshita


Change by Bruno P. Kinoshita :


--
keywords: +patch
pull_requests: +16604
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/17098

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-08 Thread Terry J. Reedy


Change by Terry J. Reedy :


--
keywords: +3.8regression
stage:  -> needs patch
type: crash -> behavior

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-04 Thread STINNER Victor


Change by STINNER Victor :


--
nosy: +giampaolo.rodola

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38688] Python 3.8 regression: endless loop in shutil.copytree

2019-11-04 Thread Christian Boltz


New submission from Christian Boltz :

The following test script works with Python 3.7 (and older), but triggers an 
endless loop with Python 3.8:


#!/usr/bin/python3

import shutil
import os

os.mkdir('/dev/shm/t')
os.mkdir('/dev/shm/t/pg')

with open('/dev/shm/t/pg/pol', 'w+') as f:
f.write('pol')

shutil.copytree('/dev/shm/t/pg', '/dev/shm/t/pg/somevendor/1.0')


The important point is probably that 'pg' gets copied into a subdirectory of 
itsself. While this worked in Python up to 3.7, doing the same in Python 3.8 
runs into an endless loop:

# python3 /home/abuild/rpmbuild/SOURCES/test.py
Traceback (most recent call last):
  File "/home/abuild/rpmbuild/SOURCES/test.py", line 15, in  
shutil.copytree('/dev/shm/t/pg', '/dev/shm/t/pg/somevendor/1.0')
  File "/usr/lib/python3.8/shutil.py", line 547, in copytree 
return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
  File "/usr/lib/python3.8/shutil.py", line 486, in _copytree
copytree(srcobj, dstname, symlinks, ignore, copy_function,
...
copytree(srcobj, dstname, symlinks, ignore, copy_function,
  File "/usr/lib/python3.8/shutil.py", line 547, in copytree 
return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
  File "/usr/lib/python3.8/shutil.py", line 449, in _copytree
os.makedirs(dst, exist_ok=dirs_exist_ok)
  File "/usr/lib/python3.8/os.py", line 206, in makedirs 
head, tail = path.split(name)
  File "/usr/lib/python3.8/posixpath.py", line 104, in split 
sep = _get_sep(p)
  File "/usr/lib/python3.8/posixpath.py", line 42, in _get_sep 
if isinstance(path, bytes):
RecursionError: maximum recursion depth exceeded while calling a Python object

I also reported this at https://bugzilla.opensuse.org/show_bug.cgi?id=1155839

--
components: Library (Lib)
messages: 355981
nosy: cboltz
priority: normal
severity: normal
status: open
title: Python 3.8 regression: endless loop in shutil.copytree
type: crash
versions: Python 3.8

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com