Re: [Python-Dev] itertools.walk()

2005-03-17 Thread Nick Coghlan
Bob Ippolito wrote:
I'm not sure why it's useful to explode the stack with all that 
recursion?  Mine didn't do that.  The control flow is nearly identical, 
but it looks more fragile (and you would get some really evil stack 
trace if iter_factory(foo) happened to raise something other than 
TypeError).
It was a holdover from my first version which *was* recursive. When I switched 
to using your chaining style, I didn't think to get rid of the now unneeded 
recursion.

So just drop the recursive calls to 'walk', and it should be back to your 
structure.
For the 'infinite recursion on basestring' example, PJE at one point suggested a 
"if item is iterable" guard that immediately yielded the item. Allowing breadth 
first operation requires something a little different, though:

from itertools import chain
def walk(iterable, depth_first=True, atomic_types=(basestring,), 
iter_factory=iter):
  itr = iter(iterable)
  while True:
for item in itr:
  if isinstance(item, atomic_types):
yield item
continue
  try:
subitr = iter_factory(item)
  except TypeError:
yield item
continue
  # Block simple cycles (like characters)
  try:
subitem = subitr.next()
  except StopIteration:
continue
  if subitem is item:
yield subitem
continue
  if depth_first:
itr = chain([subitem], subitr, itr)
  else:
itr = chain(itr, [subitem], subitr)
  break
else:
  break
Py> seq
[['123', '456'], 'abc', 'abc', 'abc', 'abc', ['xyz']]
Py> list(walk(seq))
['123', '456', 'abc', 'abc', 'abc', 'abc', 'xyz']
Py> list(walk(seq, depth_first=False))
['abc', 'abc', 'abc', 'abc', '123', '456', 'xyz']
Py> list(walk(seq, atomic_types=()))
['1', '2', '3', '4', '5', '6', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a',
 'b', 'c', 'x', 'y', 'z']
Py> list(walk(seq, depth_first=False, atomic_types=()))
['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', '1', '2', '3', '4',
 '5', '6', 'x', 'y', 'z']
Raymond may simply decide to keep things simple instead, of course.
Cheers,
Nick.
--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://boredomandlaziness.skystorm.net
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] itertools.walk()

2005-03-16 Thread Bob Ippolito
On Mar 16, 2005, at 8:37 AM, Nick Coghlan wrote:
Bob Ippolito wrote:
On Mar 16, 2005, at 6:19, Raymond Hettinger wrote:
Some folks on comp.lang.python have been pushing for itertools to
include a flatten() operation.  Unless you guys have some thoughts on
the subject, I'm inclined to accept the request.
Rather than calling it flatten(), it would be called "walk" and 
provide
a generalized capability to descend through nested iterables 
(similar to
what os.walk does for directories).  The one wrinkle is having a
stoplist argument to specify types that should be considered atomic
eventhough they might be iterable (strings for example).
You could alternatively give them a way to supply their own "iter" 
function, like the code I demonstrate below:
I think the extra flexibility ends up making the function harder to 
comprehend and use. Here's a version with a simple stoplist:
...
This makes it easy to reclassify certain things like dictionaries or 
tuples as atomic elements.

> # maybe there should be a bfswalk too?
By putting the 'walk(subitr)' after the current itr when chaining?
Yeah
If Raymond does decide to go for the flexible approach rather than the 
simple one, then I'd vote for a full-featured approach like:
I don't mind that at all.  It's certainly convenient to have an easy 
stoplist.  The problem with the way you have implemented it is that 
basestring will cause infinite recursion if you use the built-in iter, 
so if you provide your own atomic_types, you damn well better remember 
to add in basestring.

def walk(iterable, depth_first=True, atomic_types=(basestring,), 
iter_factory=iter):
  itr = iter(iterable)
  while True:
for item in itr:
  if isinstance(item, atomic_types):
yield item
continue
  try:
subitr = iter_factory(item)
  except TypeError:
yield item
  else:
if depth_first:
  itr = chain(walk(subitr), itr)
else:
  itr = chain(itr, walk(subitr))
break
else:
  break
I'm not sure why it's useful to explode the stack with all that 
recursion?  Mine didn't do that.  The control flow is nearly identical, 
but it looks more fragile (and you would get some really evil stack 
trace if iter_factory(foo) happened to raise something other than 
TypeError).

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] itertools.walk()

2005-03-16 Thread Nick Coghlan
Bob Ippolito wrote:
On Mar 16, 2005, at 6:19, Raymond Hettinger wrote:
Some folks on comp.lang.python have been pushing for itertools to
include a flatten() operation.  Unless you guys have some thoughts on
the subject, I'm inclined to accept the request.
Rather than calling it flatten(), it would be called "walk" and provide
a generalized capability to descend through nested iterables (similar to
what os.walk does for directories).  The one wrinkle is having a
stoplist argument to specify types that should be considered atomic
eventhough they might be iterable (strings for example).

You could alternatively give them a way to supply their own "iter" 
function, like the code I demonstrate below:
I think the extra flexibility ends up making the function harder to comprehend 
and use. Here's a version with a simple stoplist:

from itertools import chain
def walk(iterable, atomic_types=(basestring,)):
  itr = iter(iterable)
  while True:
for item in itr:
  if isinstance(item, atomic_types):
yield item
continue
  try:
subitr = iter(item)
  except TypeError:
yield item
  else:
itr = chain(walk(subitr), itr)
break
else:
  break
This makes it easy to reclassify certain things like dictionaries or tuples as 
atomic elements.

> # maybe there should be a bfswalk too?
By putting the 'walk(subitr)' after the current itr when chaining?
If Raymond does decide to go for the flexible approach rather than the simple 
one, then I'd vote for a full-featured approach like:

def walk(iterable, depth_first=True, atomic_types=(basestring,), 
iter_factory=iter):
  itr = iter(iterable)
  while True:
for item in itr:
  if isinstance(item, atomic_types):
yield item
continue
  try:
subitr = iter_factory(item)
  except TypeError:
yield item
  else:
if depth_first:
  itr = chain(walk(subitr), itr)
else:
  itr = chain(itr, walk(subitr))
break
else:
  break
Cheers,
Nick.
--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://boredomandlaziness.skystorm.net
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] itertools.walk()

2005-03-16 Thread Bob Ippolito
On Mar 16, 2005, at 6:19, Raymond Hettinger wrote:
Some folks on comp.lang.python have been pushing for itertools to
include a flatten() operation.  Unless you guys have some thoughts on
the subject, I'm inclined to accept the request.
Rather than calling it flatten(), it would be called "walk" and provide
a generalized capability to descend through nested iterables (similar 
to
what os.walk does for directories).  The one wrinkle is having a
stoplist argument to specify types that should be considered atomic
eventhough they might be iterable (strings for example).
You could alternatively give them a way to supply their own "iter" 
function, like the code I demonstrate below:

from itertools import chain
def nostring_iter(obj):
if isinstance(obj, basestring):
raise TypeError
return iter(obj)
def uniqueiter_factory(iterfunc=nostring_iter):
def uniqueiter(obj, uniques={}):
iterable = iterfunc(obj)
if id(obj) in uniques:
raise TypeError
uniques[id(obj)] = obj
return iterable
return uniqueiter
# maybe there should be a bfswalk too?
def walk(iterable, iterfunc=nostring_iter):
iterables = iter((iterable,))
while True:
for obj in iterables:
try:
iterable = iterfunc(obj)
except TypeError:
yield obj
else:
iterables = chain(iterable, iterables)
break
else:
break
>>> data = [('foo', 'bar'), 'baz', 5]
>>> list(walk(data))
['foo', 'bar', 'baz', 5]
>>> list(walk(data, uniqueiter_factory(iter)))
['f', 'o', 'o', 'b', 'a', 'r', 'b', 'a', 'z', 5]
>>> data.append(data)
>>> list(walk(data, uniqueiter_factory()))
['foo', 'bar', 'baz', 5, [('foo', 'bar'), 'baz', 5, [...]]]
-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com