[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

2016-04-29 Thread Aviv Palivoda

Aviv Palivoda added the comment:

Thanks for the response Ethan I think that I will leave the tests as they are 
in the current patch.

> No doubt, there are exceptions to the rule in the standard library which is 
> less consistent than we might like:  "stat_result".  That said, stat_result 
> is a structseq and many C type names are old or violate the rules (list vs 
> List, etc).   New named tuples should follow PEP 8 can use CapWords 
> convention unless there is a strong reason not to in a particular case.

I actually thought we should keep on consistency with other "result" like 
objects. I can see your point about new named tuples that should follow PEP 8 
and DirEntry is an example of new "result" class that follow PEP8.
What names do you suggest? Maybe DirInfo and FDirInfo?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

2016-04-29 Thread Raymond Hettinger

Raymond Hettinger added the comment:

https://www.python.org/dev/peps/pep-0008/#class-names -- "Class names should 
normally use the CapWords convention."

Examples:
-
crypt.py
6:from collections import namedtuple as _namedtuple
13:class _Method(_namedtuple('_Method', 'name ident salt_chars total_size')):

difflib.py
34:from collections import namedtuple as _namedtuple
36:Match = _namedtuple('Match', 'a b size')

dis.py
163:_Instruction = collections.namedtuple("_Instruction",
280:Generates a sequence of Instruction namedtuples giving the details of 
each

doctest.py
107:from collections import namedtuple
109:TestResults = namedtuple('TestResults', 'failed attempted')

functools.py
21:from collections import namedtuple
345:_CacheInfo = namedtuple("CacheInfo", ["hits", "misses", "maxsize", 
"currsize"])

inspect.py
51:from collections import namedtuple, OrderedDict
323:Attribute = namedtuple('Attribute', 'name kind defining_class object')
968:Arguments = namedtuple('Arguments', 'args, varargs, varkw')
1008:ArgSpec = namedtuple('ArgSpec', 'args varargs keywords defaults')
1032:FullArgSpec = namedtuple('FullArgSpec',
1124:ArgInfo = namedtuple('ArgInfo', 'args varargs keywords locals')
1317:ClosureVars = namedtuple('ClosureVars', 'nonlocals globals builtins 
unbound')
1372:Traceback = namedtuple('Traceback', 'filename lineno function code_context 
index')
1412:FrameInfo = namedtuple('FrameInfo', ('frame',) + Traceback._fields)

nntplib.py
159:GroupInfo = collections.namedtuple('GroupInfo',
162:ArticleInfo = collections.namedtuple('ArticleInfo',

No doubt, there are exceptions to the rule in the standard library which is 
less consistent than we might like:  "stat_result".  That said, stat_result is 
a structseq and many C type names are old or violate the rules (list vs List, 
etc).   New named tuples should follow PEP 8 can use CapWords convention unless 
there is a strong reason not to in a particular case.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

2016-04-29 Thread Ethan Furman

Ethan Furman added the comment:

I'm not clear on what you asking, but regardless we should have both the old 
(by-index) tests and new by-attribute tests.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

2016-04-29 Thread Aviv Palivoda

Aviv Palivoda added the comment:

In regard to Raymond`s points I agree with Serhiy`s comments.

As for Serhiy`s doubts:

> 3. Using namedtuple is slower and consumes more memory than using tuple. Even 
> for FS-related operation like os.walk() this can matter. A lot of code is 
> optimized for exact tuples, with namedtuple this optimization is lost.

I did some testing on my own PC:
./python -m timeit -s "from os import walk"  "for x in walk('Lib'): pass"

Regular tuple: 7.53 msec
Named tuple: 7.66 msec

> 4. New names (dirpath, dirnames, filenames) are questionable. Why not use 
> undersores (dir_names)? "dir" in dirpath refers to the current proceeded 
> directory, but "dir" in dirnames refers to it's subdirectories. Currently you 
> are free to use short names (root, dirs, files) from examples or what you 
> prefer, but with namedtuple you are sticked with standard names forever. 
> There are no names that satisfy everybody.

I agree that there will be no names that will satisfy everybody but I think the 
names that are currently in the documentation are the most trivial choice.

As for points 1,2,5 this feature doesn`t break any of the old walk API.

One more point I would like input on is the testing. I can remove the walk 
method from the WalkTests, FwalkTests classes and use the new named tuple 
attributes in the tests. Do you think its better or should we keep the tests 
with the old API (access using indexes)?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

2016-04-28 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Sorry, but I disagree with Raymond in many points.

> Classes are normally named with CamelCase.  Also, "walk_result" or 
> "WalkResult" seems like an odd name that doesn't really fit.   DirEntry or 
> DirInfo is a better match (see the OP's example, "for dir_entry in walk_it: 
> ...")

See "stat_result", "statvfs_result", "waitid_result", "uname_result", and 
"times_result". DirEntry is already used in the os module. And if accept this 
feature, needed separate types for walk() and fwalk() results.

> The "versionchanged" should be a "versionadded".

os.walk() is not new. Just it's result is changed. Class "walk_result" can be 
tagged with "versionadded", but I'm not sure there is a need to document it 
separately. The documentation of the os module already too large. 
"uname_result" and "times_result" are not documented.

> The docs and code for fwalk() needs to be harmonized with walk() so the the 
> tuple fields use the same names:  change (root, dirs, files) to (dirpath, 
> dirnames, filenames).

(root, dirs, files) is shorter than (dirpath, dirnames, filenames) and these 
names were used with os.walk() and os.fwalk() for years.

I general, I have doubts about this feature.

1. There is little backward incompatibility. At least pickle is not backward 
compatible, and I guess other serialization methods.

2. os.walk() and os.fwalk() are purposed to be used in for loop with immediate 
unpacking result tuple:

for root, dirs, files in os.walk(...):
...

Adding named tuple doesn't add any benefit for common case.

In OP case, you can either use fwalk-based implementation of walk (issue15200):

def fwalk_as_walk(*args, **kwargs):
for x in os.fwalk(*args, **kwargs):
yield x[:-1]

or just ignore the rest of tuple items:

for root, *_ in walk_it:
...

3. Using namedtuple is slower and consumes more memory than using tuple. Even 
for FS-related operation like os.walk() this can matter. A lot of code is 
optimized for exact tuples, with namedtuple this optimization is lost.

4. New names (dirpath, dirnames, filenames) are questionable. Why not use 
undersores (dir_names)? "dir" in dirpath refers to the current proceeded 
directory, but "dir" in dirnames refers to it's subdirectories. Currently you 
are free to use short names (root, dirs, files) from examples or what you 
prefer, but with namedtuple you are sticked with standard names forever. There 
are no names that satisfy everybody.

5. Third-party walk-like iterators generate tuples, so you can't use attribute 
access in too general code.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

2016-04-28 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Classes are normally named with CamelCase.  Also, "walk_result" or "WalkResult" 
seems like an odd name that doesn't really fit.   DirEntry or DirInfo is a 
better match (see the OP's example, "for dir_entry in walk_it: ...")

The "versionchanged" should be a "versionadded".

The docs should use "named tuple" instead of "namedtuple".  The former is the 
generic term used in the glossary to describe the instances.  The latter is the 
factory function that creates a new tuple subclass.

The attribute descriptions for the docs are pretty good.  They should also be 
applied as actual docstrings in the code as well.

The docs and code for fwalk() needs to be harmonized with walk() so the the 
tuple fields use the same names:  change (root, dirs, files) to (dirpath, 
dirnames, filenames).

--
nosy: +rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

2016-04-26 Thread Ethan Furman

Ethan Furman added the comment:

Quick review of patch looks good.  I'll try to look it over more closely later.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

2016-04-26 Thread Ethan Furman

Changes by Ethan Furman :


--
nosy: +ethan.furman

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

2016-04-26 Thread Aviv Palivoda

New submission from Aviv Palivoda:

I am suggesting that os.walk and os.fwalk will yield a namedtuple instead of 
the regular tuple they currently yield.
The use case for this change can be seen in the next example:

def walk_wrapper(walk_it):
for dir_entry in walk_it:
if dir_entry[0] == "aaa":
   yield dir_entry

Because walk_it can be either os.walk or os.fwalk I need to access dir_entry 
via index.

My change will allow me to change this function to:

def walk_wrapper(walk_it):
for dir_entry in walk_it:
if dir_entry.dirpath == "aaa":
   yield dir_entry

Witch is more clear and readable.

--
components: Library (Lib)
files: os-walk-result-namedtuple.patch
keywords: patch
messages: 264285
nosy: loewis, palaviv
priority: normal
severity: normal
status: open
title: os.walk and os.fwalk yield namedtuple instead of tuple
type: enhancement
versions: Python 3.6
Added file: http://bugs.python.org/file42612/os-walk-result-namedtuple.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com