[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2019-11-09 Thread Bruno P. Kinoshita
Change by Bruno P. Kinoshita: pull_requests: +16605, pull_request: https://github.com/python/cpython/pull/17098

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2019-02-23 Thread Giampaolo Rodola'
Change by Giampaolo Rodola': pull_requests: +12025

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2019-02-23 Thread flokX
Change by flokX: pull_requests: +12022

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-11-12 Thread Giampaolo Rodola'
Change by Giampaolo Rodola': assignee: -> giampaolo.rodola, resolution: -> fixed, stage: patch review -> resolved, status: open -> closed

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-11-12 Thread Giampaolo Rodola'
Giampaolo Rodola' added the comment: New changeset 19c46a4c96553b2a8390bf8a0e138f2b23e28ed6 by Giampaolo Rodola in branch 'master': bpo-33695 shutil.copytree() + os.scandir() cache (#7874) https://github.com/python/cpython/commit/19c46a4c96553b2a8390bf8a0e138f2b23e28ed6

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-10-22 Thread Giampaolo Rodola'
Giampaolo Rodola' added the comment: @Serhiy: I would like to proceed with this. Do you have further comments? Would you prefer to bring this up on python-dev for further discussion?

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-08-02 Thread Yury Selivanov
Yury Selivanov added the comment:
> Depending on the configuration, stat() in a network filesystem can be between
> very slow and slow.
+1. I also quickly glanced over the patch and I think it looks like a clear win.

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-08-02 Thread Giampaolo Rodola'
Giampaolo Rodola' added the comment: Yes, file copy (open() + read() + write()) is of course more expensive than just "reading" a tree (os.walk(), glob()) or deleting it (rmtree()) and the "pure file copy" time adds up to the benchmark. And indeed it's not a coincidence that #33671 (which

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-08-02 Thread STINNER Victor
STINNER Victor added the comment: When I worked on the os.scandir() implementation, I recall that an interesting test was NFS. Depending on the configuration, stat() in a network filesystem can be between very slow and slow.

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-08-01 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: os.walk() and glob.glob() used *only* stat(), opendir() and readdir() syscalls (and stat() syscalls dominated). The effect of reducing the number of stat() syscalls is significant. shutil.rmtree() also uses the unlink() syscall. Since it is usually

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-08-01 Thread Giampaolo Rodola'
Giampaolo Rodola' added the comment: I agree the provided benchmark on Linux should be more refined. And I'm not sure if "echo 3 | sudo tee /proc/sys/vm/drop_caches" before running it is enough honestly. The main point here is the reduction of stat() syscalls (-38%) and that can make a

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-08-01 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: To drop disk caches on Linux, run the following before every test:

    with open('/proc/sys/vm/drop_caches', 'ab') as f:
        f.write(b'3\n')
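The cache-dropping step quoted above could be wrapped in a small helper for a benchmark script. This is only a sketch: the function name drop_disk_caches and its path parameter are hypothetical, and the os.sync() call is an addition not found in the thread (it flushes dirty pages first so that more of the cache can actually be released). Writing to /proc/sys/vm/drop_caches normally requires root.

```python
import os

# Linux-only control file named in the thread; writing b"3\n" asks the
# kernel to drop the page cache plus dentries and inodes.
DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_disk_caches(path=DROP_CACHES):
    """Try to drop the disk caches; return True on success, False otherwise.

    Hypothetical helper: the name and the 'path' parameter are illustrative.
    """
    try:
        # Assumption (not from the thread): flush dirty pages first.
        if hasattr(os, "sync"):
            os.sync()
        with open(path, "ab") as f:
            f.write(b"3\n")
    except OSError:
        # Typically a permission error (not root) or a non-Linux system.
        return False
    return True
```

A benchmark would call drop_disk_caches() before each timed run and treat a False result as "timings include a warm cache".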

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-08-01 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I'm not convinced that this change should be merged. The benefit is small, and 1) it is only for an artificial set of tiny files, 2) the benchmarking ignores the real IO, it measures the work with a cache. When copying real files (/usr/include or Lib/) with

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-07-17 Thread Giampaolo Rodola'
Giampaolo Rodola' added the comment: Unless somebody has complaints, I think I'm gonna merge this soon.

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-06-23 Thread Giampaolo Rodola'
Giampaolo Rodola' added the comment:
> I re-ran benchmarks since shutil code changed after #33695.
Sorry, I meant #33671.

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-06-23 Thread Giampaolo Rodola'
Giampaolo Rodola' added the comment: PR at: https://github.com/python/cpython/pull/7874. I re-ran benchmarks since shutil code changed after #33695. Linux went from +13.5% to +8.8% and Windows went from +17% to +20.7%. In the PR I explicitly avoided using a context manager around os.scandir()

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-06-23 Thread Giampaolo Rodola'
Change by Giampaolo Rodola': pull_requests: +7481

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-05-30 Thread Giampaolo Rodola'
Change by Giampaolo Rodola': nosy: +benhoyt, benjamin.peterson, brett.cannon, ncoghlan, serhiy.storchaka, stutzbach, tarek, vstinner, yselivanov

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-05-30 Thread Giampaolo Rodola'
Change by Giampaolo Rodola': keywords: +patch, Added file: https://bugs.python.org/file47625/bpo-33695.patch

[issue33695] Have shutil.copytree(), copy() and copystat() use cached scandir() stat()s

2018-05-30 Thread Giampaolo Rodola'
New submission from Giampaolo Rodola': The attached patch makes shutil.copytree() use os.scandir() and (unlike #33414) DirEntry instances are passed around so that cached stat()s are also used from within the copy2() and copystat() functions. The number of times the filesystem gets
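To illustrate the idea in the submission above, here is a minimal sketch of a recursive copy that classifies entries through os.DirEntry instead of calling os.path.isdir() and os.path.islink() on each path. It is not the code from the patch: the function name copytree_sketch is hypothetical, and unlike the patch it passes plain paths (not DirEntry objects) to shutil.copy2() and shutil.copystat(), so only the type checks benefit from the information os.scandir() has already gathered.

```python
import os
import shutil


def copytree_sketch(src, dst):
    """Recursively copy src into dst (hypothetical helper, not the patch)."""
    os.makedirs(dst, exist_ok=True)
    with os.scandir(src) as entries:
        for entry in entries:
            target = os.path.join(dst, entry.name)
            # is_symlink() and is_dir() can usually be answered from data
            # that scandir() already fetched, avoiding extra stat() calls.
            if entry.is_symlink():
                # Recreate the link itself rather than copying its target.
                os.symlink(os.readlink(entry.path), target)
            elif entry.is_dir():
                copytree_sketch(entry.path, target)
            else:
                # copy2() copies the file contents and its metadata.
                shutil.copy2(entry.path, target)
    # Copy directory metadata last, once its contents are in place.
    shutil.copystat(src, dst)
```

Calling copytree_sketch("some/source", "some/destination") mirrors a small tree; for real use, shutil.copytree() remains the right tool, since it also handles the ignore, symlinks and error-collection options.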