[issue15200] Faster os.walk

2012-10-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Timing of walk depends on how deep we dive into the directories. $ ./python -m timeit -s "from os import walk" "for x in walk('/home/serhiy/py/1/2/3/4/5/6/7/8/9/cpython/'): pass" 10 loops, best of 3: 398 msec per loop $ ./python -m timeit -s from os import

[issue15200] Faster os.walk

2012-06-27 Thread Serhiy Storchaka
files: faster_walk.patch keywords: patch messages: 164127 nosy: storchaka priority: normal severity: normal status: open title: Faster os.walk type: performance versions: Python 3.4 Added file: http://bugs.python.org/file26175/faster_walk.patch

[issue15200] Faster os.walk

2012-06-27 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- nosy: +larry ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15200 ___ ___ Python-bugs-list

[issue15200] Faster os.walk

2012-06-27 Thread Larry Hastings
Larry Hastings la...@hastings.org added the comment: It's amusing that using fwalk and throwing away the last argument is faster than a handwritten implementation. On the other hand, fwalk also uses a lot of file descriptors. Users with processes which were already borderline on max file
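A minimal sketch of the idea Larry describes, not the contents of faster_walk.patch: os.walk()-style results produced by os.fwalk() with the trailing directory file descriptor discarded. The name walk_via_fwalk is hypothetical, and os.fwalk() is only available on Unix.

```python
import os


def walk_via_fwalk(top, topdown=True, onerror=None, followlinks=False):
    """Yield (dirpath, dirnames, filenames) like os.walk(), backed by os.fwalk().

    Hypothetical wrapper for illustration only; requires a platform
    that provides os.fwalk() (Unix).
    """
    for dirpath, dirnames, filenames, _dirfd in os.fwalk(
            top, topdown=topdown, onerror=onerror,
            follow_symlinks=followlinks):
        # Drop the directory file descriptor, the "last argument"
        # mentioned in the comment above.
        yield dirpath, dirnames, filenames
```

While the generator is suspended deep in a tree, os.fwalk() still holds its directory descriptors open, which is the file descriptor concern raised in this thread.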

[issue15200] Faster os.walk

2012-06-27 Thread Charles-François Natali
Charles-François Natali neolo...@free.fr added the comment: On the other hand, fwalk also uses a lot of file descriptors. Users with processes which were already borderline on max file descriptors might not appreciate upgrading to find their os.walk calls suddenly failing. It doesn't

[issue15200] Faster os.walk

2012-06-27 Thread Larry Hastings
Larry Hastings la...@hastings.org added the comment: It doesn't have to. Right now, it uses O(depth of the directory tree) FDs. It can be changed to only require O(1) FDs. But closing and reopening those file descriptors seems like it might slow it down; would it still be a performance win?

[issue15200] Faster os.walk

2012-06-27 Thread Arfrever Frehtes Taifersar Arahesis
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com: nosy: +Arfrever

[issue15200] Faster os.walk

2012-06-27 Thread Ross Lagerwall
Ross Lagerwall rosslagerw...@gmail.com added the comment: This looks like the kind of optimization that depends hugely on what kernel you're using. Maybe on FreeBSD/Solaris/whatever, standard os.walk() is faster? If this micro-optimization were to be accepted, someone would have to be keen

[issue15200] Faster os.walk

2012-06-27 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: This looks like the kind of optimization that depends hugely on what kernel you're using. Agreed. Also, I'm worried that there might be subtle differences between walk() and fwalk() which could come and bite users if we silently redirect the

Faster os.walk()

2005-04-20 Thread fuzzylollipop
I am trying to get the number of bytes used by files in a directory. I am using a large directory (lots of stuff checked out of multiple large CVS repositories) and there is lots of wasted time doing multiple os.stat() on dirs and files from different methods.

Re: Faster os.walk()

2005-04-20 Thread Laszlo Zsolt Nagy
fuzzylollipop wrote: I am trying to get the number of bytes used by files in a directory. I am using a large directory (lots of stuff checked out of multiple large CVS repositories) and there is lots of wasted time doing multiple os.stat() on dirs and files from different methods. Do you need

Re: Faster os.walk()

2005-04-20 Thread Peter Hansen
Laszlo Zsolt Nagy wrote: fuzzylollipop wrote: I am trying to get the number of bytes used by files in a directory. I am using a large directory (lots of stuff checked out of multiple large CVS repositories) and there is lots of wasted time doing multiple os.stat() on dirs and files from

Re: Faster os.walk()

2005-04-20 Thread fuzzylollipop
du is faster than my code that does the same thing in Python; it is highly optimized at the OS level. That said, I profiled spawning an external process to call du, and over the large number of times I need to do this it is actually slower to execute du externally than my os.walk() implementation.

Re: Faster os.walk()

2005-04-20 Thread Philippe C. Martin
How about rerouting stdout/err and 'popening something like /bin/find -name '*' -exec a_script_or_cmd_that_does_what_i_want_with_the_file {} \; ? Regards, Philippe fuzzylollipop wrote: du is faster than my code that does the same thing in Python; it is highly optimized at the OS level.

Re: Faster os.walk()

2005-04-20 Thread Kent Johnson
fuzzylollipop wrote: after extensive profiling I found out that the way that os.walk() is implemented it calls os.stat() on the dirs and files multiple times and that is where all the time is going. os.walk() is pretty simple, you could copy it and make your own version that calls os.stat() just
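Kent's suggestion could be sketched roughly as follows. This is an illustration, not code from the thread: walk_with_stat is a hypothetical name, and using os.lstat() (so symlinks are not followed) is an assumption. Each entry is stat'ed exactly once, and the result is passed on to the caller so it never has to be repeated.

```python
import os
import stat


def walk_with_stat(top):
    """Yield (dirpath, dirs, files); each entry is a (name, stat_result) pair.

    Hypothetical helper: every directory entry is stat'ed exactly once
    and the result is handed to the caller instead of being discarded.
    """
    try:
        names = os.listdir(top)
    except OSError:
        return  # unreadable directory: skip it silently
    dirs, files = [], []
    for name in names:
        try:
            st = os.lstat(os.path.join(top, name))  # the only stat call
        except OSError:
            continue  # entry vanished between listdir() and lstat()
        if stat.S_ISDIR(st.st_mode):
            dirs.append((name, st))
        else:
            files.append((name, st))
    yield top, dirs, files
    for name, _st in dirs:
        yield from walk_with_stat(os.path.join(top, name))
```

A byte count then reuses the cached results, for example `sum(st.st_size for _, _, files in walk_with_stat(path) for _, st in files)`, where `path` is whatever directory is being measured.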

Re: Faster os.walk()

2005-04-20 Thread Nick Craig-Wood
fuzzylollipop wrote: I am trying to get the number of bytes used by files in a directory. I am using a large directory (lots of stuff checked out of multiple large CVS repositories) and there is lots of wasted time doing multiple os.stat() on dirs and files from

Re: Faster os.walk()

2005-04-20 Thread Lonnie Princehouse
If you're trying to track changes to files (e.g. by comparing current size with previously recorded size), fam might obviate a lot of filesystem traversal. http://python-fam.sourceforge.net/ -- http://mail.python.org/mailman/listinfo/python-list

Re: Faster os.walk()

2005-04-20 Thread fuzzylollipop
Ding, ding, ding, we have a winner. One of the guys on the team did just this: he re-implemented the os.walk() logic and embedded the S_IFDIR, S_IFMT and S_IFREG checks directly into the traversal code. This is all going to run on Unix or Linux machines in production so this is not a big
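The team's actual code is not shown in the thread, so the following is only a guess at the general shape of such a traversal. The name tree_size, the explicit stack, and the choice of os.lstat() are all assumptions made for illustration.

```python
import os
from stat import S_IFDIR, S_IFMT, S_IFREG


def tree_size(top):
    """Return the total size in bytes of the regular files under top."""
    total = 0
    pending = [top]  # explicit stack instead of recursion
    while pending:
        dirpath = pending.pop()
        try:
            names = os.listdir(dirpath)
        except OSError:
            continue  # unreadable directory: skip it
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # one stat call per entry
            except OSError:
                continue  # entry vanished or is inaccessible
            kind = S_IFMT(st.st_mode)  # extract the file-type bits
            if kind == S_IFDIR:
                pending.append(path)
            elif kind == S_IFREG:
                total += st.st_size
    return total
```

This sums st_size, the apparent file size, so the result can differ from what du reports in disk blocks.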