[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

Serhiy Storchaka Thu, 28 Apr 2016 01:01:43 -0700

Serhiy Storchaka added the comment:

Sorry, but I disagree with Raymond on many points.


> Classes are normally named with CamelCase.  Also, "walk_result" or 
> "WalkResult" seems like an odd name that doesn't really fit.   DirEntry or 
> DirInfo is a better match (see the OP's example, "for dir_entry in walk_it: 
> ...")

See "stat_result", "statvfs_result", "waitid_result", "uname_result", and 
"times_result". DirEntry is already used in the os module. And if this feature 
is accepted, separate types would be needed for walk() and fwalk() results.

> The "versionchanged" should be a "versionadded".

os.walk() is not new. Only its result is changed. The class "walk_result" could 
be tagged with "versionadded", but I'm not sure there is a need to document it 
separately. The documentation of the os module is already too large. 
"uname_result" and "times_result" are not documented.

> The docs and code for fwalk() needs to be harmonized with walk() so the the 
> tuple fields use the same names:  change (root, dirs, files) to (dirpath, 
> dirnames, filenames).

(root, dirs, files) is shorter than (dirpath, dirnames, filenames), and these 
names have been used with os.walk() and os.fwalk() for years.

In general, I have doubts about this feature.

1. There is a small backward incompatibility. At least pickle is not backward 
compatible, and I guess neither are other serialization methods.

2. os.walk() and os.fwalk() are intended to be used in a for loop that 
immediately unpacks the result tuple:

    for root, dirs, files in os.walk(...):
        ...

A named tuple adds no benefit in this common case.

In the OP's case, you can either use an fwalk-based implementation of walk 
(issue15200):

    def fwalk_as_walk(*args, **kwargs):
        # Drop the trailing dirfd so each item matches os.walk()'s 3-tuple.
        for x in os.fwalk(*args, **kwargs):
            yield x[:-1]

or just ignore the rest of the tuple items:

    for root, *_ in walk_it:
        ...

3. Using a namedtuple is slower and consumes more memory than using a tuple. 
Even for an FS-related operation like os.walk() this can matter. A lot of code 
is optimized for exact tuples; with a namedtuple this optimization is lost.

4. The new names (dirpath, dirnames, filenames) are questionable. Why not use 
underscores (dir_names)? "dir" in dirpath refers to the directory currently 
being processed, but "dir" in dirnames refers to its subdirectories. Currently 
you are free to use the short names (root, dirs, files) from the examples or 
whatever you prefer, but with a namedtuple you are stuck with the standard 
names forever. There are no names that satisfy everybody.

5. Third-party walk-like iterators generate plain tuples, so you can't use 
attribute access in code that is meant to be general.
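
To make point 1 concrete, here is a minimal sketch. It uses os.stat_result, 
an existing result type in the os module, as a stand-in for the proposed 
walk_result (which does not exist); the variable names are only illustrative. 
The pickle of the typed result refers to its class by name, while the pickle 
of a plain tuple does not:

```python
import os
import pickle

# os.stat_result stands in for the proposed walk_result type (an assumption
# made for illustration; no such walk_result class exists).
st = os.stat(os.curdir)

typed_payload = pickle.dumps(st)         # pickled as a reference to os.stat_result
plain_payload = pickle.dumps(tuple(st))  # pickled as a plain tuple of ints

# The class name is embedded only in the typed payload.
print(b"stat_result" in typed_payload)
print(b"stat_result" in plain_payload)
```

A payload of the first kind can only be loaded where the named class can be 
imported, which would not hold for a new walk_result type on older Python 
versions. The plain tuple payload has no such dependency.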

----------
nosy: +serhiy.storchaka

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue26860>
_______________________________________
