Larry Hastings <la...@hastings.org> added the comment:

> It doesn't have to.
> Right now, it uses O(depth of the directory tree) FDs. 
> It can be changed to only require O(1) FDs

But closing and reopening those file descriptors seems like it might slow it 
down; would it still be a performance win?
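
(For concreteness, here's a rough sketch of what I take the O(1)-FD
scheme to mean -- not the actual patch, POSIX-only, and it visits
children in a different order than os.walk: keep only the current
directory's FD, and pay an extra open()/close() pair via ".." on
every ascent.)

import os
import stat

def _is_dir(dirfd, name):
    # Don't follow symlinks, in the spirit of fwalk's default.
    try:
        st = os.stat(name, dir_fd=dirfd, follow_symlinks=False)
    except OSError:
        return False
    return stat.S_ISDIR(st.st_mode)

def walk_one_fd(top):
    """Yield (dirpath, names), holding at most ONE directory FD open."""
    fd = os.open(top, os.O_RDONLY | os.O_DIRECTORY)
    path = top
    names = os.listdir(fd)
    yield path, names
    # Per ancestor level, the sibling directories still to visit;
    # note that we deliberately do NOT keep the ancestors' FDs.
    stack = [[n for n in names if _is_dir(fd, n)]]
    while stack:
        if stack[-1]:
            name = stack[-1].pop()
            child = os.open(name, os.O_RDONLY | os.O_DIRECTORY, dir_fd=fd)
            os.close(fd)                   # drop the parent's FD
            fd, path = child, os.path.join(path, name)
            names = os.listdir(fd)
            yield path, names
            stack.append([n for n in names if _is_dir(fd, n)])
        else:
            stack.pop()
            if stack:
                # Recover the parent by reopening ".." -- this is the
                # extra open()/close() pair I'm worried about above.
                parent = os.open("..", os.O_RDONLY | os.O_DIRECTORY,
                                 dir_fd=fd)
                os.close(fd)
                fd, path = parent, os.path.dirname(path)
            else:
                os.close(fd)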

Also, I'm not a security expert, but would the closing/reopening open the 
door to timing attacks?  If so, that might still be okay for walk, which 
makes no guarantees about safety.  (But obviously it would be unacceptable 
for fwalk.)
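
(To make the worry concrete: what I'm really picturing is the race
window that a close-and-reopen-by-*path* scheme would create; a
contrived sketch with made-up paths.  Reopening via ".." relative to
a still-open child FD, as in the sketch above, avoids this particular
race, though it has its own subtlety if a directory is moved mid-walk.)

import os

fd = os.open("/tmp/tree/sub", os.O_RDONLY)
# ... the walk closes fd here to stay at O(1) FDs ...
os.close(fd)
# An attacker with write access to /tmp/tree can now do:
#   os.rename("/tmp/tree/sub", "/tmp/tree/x")
#   os.symlink("/etc", "/tmp/tree/sub")
# ... so this reopen can land somewhere else entirely:
fd = os.open("/tmp/tree/sub", os.O_RDONLY)
os.close(fd)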


> Anyway, I think that such optimization is useless, because this
> micro-benchmark doesn't make much sense: when you walk a
> directory tree, it's usually to do something with the
> files/directories encountered, and as soon as you do something
> with them - stat(), unlink(), etc - the gain on the walking
> time will become negligible.

I'm not sure that "usually" is true here.  I'd suggest that people usually 
use os.walk to find *particular files* in a directory tree, generally by 
filename.  So most of the time os.walk really is just quickly iterating over 
a directory tree, doing very little with each entry.
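
(Something like this is the pattern I have in mind -- find() is just
a made-up name for illustration:)

import os

def find(top, filename):
    # The hot loop: nothing but walk itself touches the filesystem.
    for dirpath, dirnames, filenames in os.walk(top):
        if filename in filenames:
            yield os.path.join(dirpath, filename)

# e.g. the first Makefile under the current tree:
# next(find(".", "Makefile"), None)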

I think 20% is a respectable gain, and it's hard for me to say "no" to a 
change that makes Python faster for free.  (Well, for the possible cost of a 
slightly more expensive algorithm.)  So I'm +x for now.
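
(For anyone who wants to eyeball it: a crude stand-in for the
micro-benchmark in question, comparing today's walk() against fwalk()
as a proxy for the patched walk().  "/usr/lib" is an arbitrary local
tree; the 20% figure comes from the patch's measurements, not from
this snippet.)

import os
import timeit

tree = "/usr/lib"                     # any reasonably deep local tree
for fn in (os.walk, os.fwalk):        # fwalk is POSIX-only
    secs = timeit.timeit(lambda: sum(1 for _ in fn(tree)), number=3)
    print(fn.__name__, secs)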
