New submission from Niklas Hambüchen <n...@deditus.de>:

See http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a 
for the explanation and equivalent fix in coreutils.

The gist ist that deleting entries in inode order can improve deletion 
performance dramatically.

To obtain inode numbers and sort by them, one needs to `getdents()` all the 
entries ahead of time, but rmtree() already gets all dirents ahead of the 
deletion. https://bugs.python.org/issue28564 recently improved shutil.rmtree() 
performance by using scandir(), but nevertheless the returned generator is 
list()ed, so we already have all necessary informtaion in memory and would just 
have to perform an O(n) integer sort.

I propose we check if the improvements made to `rm -r` in coreutils should be 
ported to shutil.rmtree().

----------
components: Library (Lib)
messages: 309217
nosy: nh2
priority: normal
severity: normal
status: open
title: shutil.rmtree can have O(n^2) performance on large dirs
type: enhancement
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32453>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to