Nested iteration?
In reviewing somebody else's code today, I found the following construct (eliding some details):

    f = open(filename)
    for line in f:
        if re.search(pattern1, line):
            outer_line = f.next()
            for inner_line in f:
                if re.search(pattern2, inner_line):
                    inner_line = f.next()

Somewhat to my surprise, the code worked. I didn't know it was legal to do nested iterations over the same iterable (not to mention mixing calls to next() with for-loops). Is this guaranteed to work in all situations?

--
http://mail.python.org/mailman/listinfo/python-list
Re: Nested iteration?
On 23 April 2013 16:40, Roy Smith r...@panix.com wrote:
> In reviewing somebody else's code today, I found the following construct (eliding some details):
>
>     f = open(filename)
>     for line in f:
>         if re.search(pattern1, line):
>             outer_line = f.next()
>             for inner_line in f:
>                 if re.search(pattern2, inner_line):
>                     inner_line = f.next()
>
> Somewhat to my surprise, the code worked. I didn't know it was legal to do nested iterations over the same iterable (not to mention mixing calls to next() with for-loops). Is this guaranteed to work in all situations?

For Python 3 you'd need next(f) instead of f.next(). Otherwise, yes, this works just fine with any non-restarting iterator (i.e. one whose __iter__ just returns self rather than a new iterator).

I recently posted in another thread about why it's a bad idea to call next() without catching StopIteration though. I wouldn't accept the code above for that reason.

Oscar
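The "non-restarting iterator" condition can be checked directly. The following is a minimal sketch, not taken from the thread; the names items, it and pairs are made up for illustration. It shows that a list hands out a fresh iterator on each iter() call, while an iterator returns itself, so nested loops over it share one position.

```python
# Containers return a new iterator from iter(); iterators return themselves.
items = ["a", "b", "c"]  # hypothetical sample data

container_is_own_iterator = iter(items) is items  # a list is not its own iterator
it = iter(items)
iterator_is_own_iterator = iter(it) is it         # an iterator returns itself

# Nested loops over the same iterator share a single position:
# the inner loop continues from wherever the outer loop stopped.
pairs = []
for x in it:
    for y in it:
        pairs.append((x, y))

print(container_is_own_iterator, iterator_is_own_iterator)
print(pairs)
```

Here the outer loop body runs only once, because the inner loop consumes everything that is left.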
Re: Nested iteration?
On Tue, Apr 23, 2013 at 9:40 AM, Roy Smith r...@panix.com wrote:
> In reviewing somebody else's code today, I found the following construct (eliding some details):
>
>     f = open(filename)
>     for line in f:
>         if re.search(pattern1, line):
>             outer_line = f.next()
>             for inner_line in f:
>                 if re.search(pattern2, inner_line):
>                     inner_line = f.next()
>
> Somewhat to my surprise, the code worked. I didn't know it was legal to do nested iterations over the same iterable (not to mention mixing calls to next() with for-loops). Is this guaranteed to work in all situations?

Yes, although the results will be different depending on whether the iterable stores its iteration state on itself (like a file object) or in the iterator (like a list). In the latter case, you would simply have two independent simultaneous iterations of the same object. You can replicate the same effect in the latter case though by getting an iterator from the object and explicitly looping over the same iterator, like so:

    i = iter(range(10))
    for x in i:
        if x % 4 == 1:
            for y in i:
                if y % 4 == 3:
                    print("%d + %d = %d" % (x, y, x+y))
                    break
Re: Nested iteration?
Roy Smith wrote:
> In reviewing somebody else's code today, I found the following construct (eliding some details):
>
>     f = open(filename)
>     for line in f:
>         if re.search(pattern1, line):
>             outer_line = f.next()
>             for inner_line in f:
>                 if re.search(pattern2, inner_line):
>                     inner_line = f.next()
>
> Somewhat to my surprise, the code worked. I didn't know it was legal to do nested iterations over the same iterable (not to mention mixing calls to next() with for-loops). Is this guaranteed to work in all situations?

That depends on what you mean by "all". A well-behaved iterator like Python's file object allows mixing of for loops and next(...) calls, but stupid people who deserve to burn in hell sometimes do

    class MyIterable:
        def __iter__(self):
            reset_internal_counter()
            return self

with the consequence that every for loop implicitly resets the iterator's state.
Re: Nested iteration?
On Wed, Apr 24, 2013 at 1:40 AM, Roy Smith r...@panix.com wrote:
> In reviewing somebody else's code today, I found the following construct (eliding some details):
>
>     f = open(filename)
>     for line in f:
>         if re.search(pattern1, line):
>             outer_line = f.next()
>             for inner_line in f:
>                 if re.search(pattern2, inner_line):
>                     inner_line = f.next()
>
> Somewhat to my surprise, the code worked. I didn't know it was legal to do nested iterations over the same iterable (not to mention mixing calls to next() with for-loops). Is this guaranteed to work in all situations?

The definition of the for loop is sufficiently simple that this is safe, with the caveat already mentioned (that __iter__ is just returning self). And calling next() inside the loop will simply terminate the loop if there's nothing there, so I'd not have a problem with code like that - for instance, if I wanted to iterate over pairs of lines, I'd happily do this:

    for line1 in f:
        line2 = next(f)
        print(line2)
        print(line1)

That'll happily swap pairs, ignoring any stray line at the end of the file. Why bother catching StopIteration just to break?

ChrisA
Re: Nested iteration?
On Tue, Apr 23, 2013 at 10:21 AM, Chris Angelico ros...@gmail.com wrote:
> The definition of the for loop is sufficiently simple that this is safe, with the caveat already mentioned (that __iter__ is just returning self). And calling next() inside the loop will simply terminate the loop if there's nothing there, so I'd not have a problem with code like that - for instance, if I wanted to iterate over pairs of lines, I'd happily do this:
>
>     for line1 in f:
>         line2 = next(f)
>         print(line2)
>         print(line1)
>
> That'll happily swap pairs, ignoring any stray line at the end of the file. Why bother catching StopIteration just to break?

The next() there will *not* simply terminate the loop if it raises a StopIteration; for loops do not catch StopIteration exceptions that are raised from the body of the loop. The StopIteration will continue to propagate until it is caught or it reaches the sys.excepthook. In unusual circumstances, it is even possible that it could cause some *other* loop higher in the stack to break (i.e. if the current code is being run as a result of the next() method being called by the looping construct).
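The point about propagation can be seen in a small sketch. This is an illustration only: the function names swap_pairs_unguarded and swap_pairs_guarded are hypothetical, and a list stands in for the file so the example is self-contained. With an odd number of lines, the unguarded next() raises StopIteration out of the function instead of quietly ending the loop.

```python
def swap_pairs_unguarded(lines):
    """Swap adjacent lines; next() in the loop body is not guarded."""
    it = iter(lines)
    out = []
    for line1 in it:
        line2 = next(it)  # raises StopIteration if a stray last line remains
        out.extend([line2, line1])
    return out


def swap_pairs_guarded(lines):
    """Same idea, but a stray last line ends the loop explicitly."""
    it = iter(lines)
    out = []
    for line1 in it:
        try:
            line2 = next(it)
        except StopIteration:
            break  # drop the unpaired final line
        out.extend([line2, line1])
    return out


# The for loop does not swallow the exception raised in its body.
try:
    swap_pairs_unguarded(["a", "b", "c"])
    escaped = False
except StopIteration:
    escaped = True

print(escaped)
print(swap_pairs_guarded(["a", "b", "c"]))
```

The guarded version is a few lines longer, but the caller never sees a StopIteration that could be mistaken for the end of some other iteration.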
Re: Nested iteration?
On Tue, 23 Apr 2013 11:40:31 -0400, Roy Smith wrote:
> In reviewing somebody else's code today, I found the following construct (eliding some details):
>
>     f = open(filename)
>     for line in f:
>         if re.search(pattern1, line):
>             outer_line = f.next()
>             for inner_line in f:
>                 if re.search(pattern2, inner_line):
>                     inner_line = f.next()
>
> Somewhat to my surprise, the code worked. I didn't know it was legal to do nested iterations over the same iterable (not to mention mixing calls to next() with for-loops). Is this guaranteed to work in all situations?

In all situations? No of course not, this is Python, you can write nasty code that explodes the *second* time you iterate over it, but not the first.

    class Demo:
        flag = False
        def __iter__(self):
            if self.flag:
                raise RuntimeError("don't do that!")
            self.flag = True
            return iter([1, 2, 3])

But under normal circumstances with normal iterables, yes, it's fine.

If the object is a sequence, like lists or strings, each for-loop is independent of the others:

    py> s = "ab"
    py> for c in s:
    ...     for k in s:
    ...         print c, k
    ...
    a a
    a b
    b a
    b b

If the object is an iterator, each loop consumes a single value:

    py> it = iter("abcd")
    py> for c in it:
    ...     for k in it:
    ...         print c, k
    ...
    a b
    a c
    a d

Each time you call next(), a single value is consumed. It doesn't matter whether you have one for-loop calling next() behind the scenes, or ten loops, or you call next() yourself, the same rule applies.

--
Steven
Re: Nested iteration?
On Tue, Apr 23, 2013 at 10:30 AM, Ian Kelly ian.g.ke...@gmail.com wrote:
> On Tue, Apr 23, 2013 at 10:21 AM, Chris Angelico ros...@gmail.com wrote:
>> The definition of the for loop is sufficiently simple that this is safe, with the caveat already mentioned (that __iter__ is just returning self). And calling next() inside the loop will simply terminate the loop if there's nothing there, so I'd not have a problem with code like that - for instance, if I wanted to iterate over pairs of lines, I'd happily do this:
>>
>>     for line1 in f:
>>         line2 = next(f)
>>         print(line2)
>>         print(line1)
>>
>> That'll happily swap pairs, ignoring any stray line at the end of the file. Why bother catching StopIteration just to break?
>
> The next() there will *not* simply terminate the loop if it raises a StopIteration; for loops do not catch StopIteration exceptions that are raised from the body of the loop. The StopIteration will continue to propagate until it is caught or it reaches the sys.excepthook. In unusual circumstances, it is even possible that it could cause some *other* loop higher in the stack to break (i.e. if the current code is being run as a result of the next() method being called by the looping construct).

To expand on this, the prevailing wisdom here is that calls to next() should always be guarded with a StopIteration exception handler. The one exception to this is when the next() call is inside the body of a generator function, and the exception handler would cause the generator to exit anyway; in that case there is little difference between "except StopIteration: return" and letting the StopIteration propagate to the generator object.
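One caveat for readers on current interpreters: the generator exception described above reflects Python as it was in 2013. Since PEP 479 (the default behaviour from Python 3.7), a StopIteration that escapes a generator body is converted into a RuntimeError, so the explicit handler is needed there as well. A minimal sketch, assuming Python 3.7 or later; the generator names pairs and pairs_unguarded are hypothetical:

```python
def pairs(iterable):
    """Yield consecutive pairs, silently dropping a stray last item."""
    it = iter(iterable)
    for first in it:
        try:
            second = next(it)
        except StopIteration:
            return  # end the generator cleanly
        yield first, second


def pairs_unguarded(iterable):
    """Same logic without the handler; relies on pre-PEP 479 behaviour."""
    it = iter(iterable)
    for first in it:
        yield first, next(it)


guarded_result = list(pairs("abcde"))

# On Python 3.7+ the leaked StopIteration surfaces as RuntimeError.
try:
    list(pairs_unguarded("abc"))
    unguarded_error = None
except RuntimeError as exc:
    unguarded_error = type(exc).__name__

print(guarded_result)
print(unguarded_error)
```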
Re: Nested iteration?
On Wed, Apr 24, 2013 at 2:30 AM, Ian Kelly ian.g.ke...@gmail.com wrote:
> On Tue, Apr 23, 2013 at 10:21 AM, Chris Angelico ros...@gmail.com wrote:
>> The definition of the for loop is sufficiently simple that this is safe, with the caveat already mentioned (that __iter__ is just returning self). And calling next() inside the loop will simply terminate the loop if there's nothing there, so I'd not have a problem with code like that - for instance, if I wanted to iterate over pairs of lines, I'd happily do this:
>>
>>     for line1 in f:
>>         line2 = next(f)
>>         print(line2)
>>         print(line1)
>>
>> That'll happily swap pairs, ignoring any stray line at the end of the file. Why bother catching StopIteration just to break?
>
> The next() there will *not* simply terminate the loop if it raises a StopIteration; for loops do not catch StopIteration exceptions that are raised from the body of the loop. The StopIteration will continue to propagate until it is caught or it reaches the sys.excepthook. In unusual circumstances, it is even possible that it could cause some *other* loop higher in the stack to break (i.e. if the current code is being run as a result of the next() method being called by the looping construct).

Ah, whoops, my bad. This is what I get for not checking. I know I've done weird stuff with for loops before, but I guess it was fiddling inside the top of it, not in its body.

I love this list. If I make a mistake, it's sure to be caught by someone else. The record is guaranteed to be set straight. Thanks Ian!

ChrisA
Re: Nested iteration?
On Wed, 24 Apr 2013 02:42:41 +1000, Chris Angelico wrote:
> I love this list. If I make a mistake, it's sure to be caught by someone else.

No it's not!

Are-you-here-for-the-five-minute-argument-or-the-full-ten-minutes-ly y'rs,

--
Steven
Re: Nested iteration?
On 4/23/2013 11:40 AM, Roy Smith wrote:
> In reviewing somebody else's code today, I found the following construct (eliding some details):
>
>     f = open(filename)
>     for line in f:
>         if re.search(pattern1, line):
>             outer_line = f.next()
>             for inner_line in f:
>                 if re.search(pattern2, inner_line):
>                     inner_line = f.next()

Did you possibly elide a 'break' after the inner_line assignment?

> Somewhat to my surprise, the code worked.

Without a break, the inner loop will continue iterating through the rest of the file (billions of lines?) looking for pattern2 and re-binding inner_line if there is another line or raising StopIteration if there is not. Does this really constitute 'working'?

This is quite aside from the issue of what one wants if there is no pattern1 or if there is no line after the first match (probably not StopIteration) or if there is no pattern2.

> I didn't know it was legal to do nested iterations over the same iterable

Yes, but the effect is quite different for iterators (start where the outer iteration left off) and non-iterators (restart at the beginning).

    r = range(2)
    for i in r:
        for j in r:
            print(i,j)
    # this is a common idiom to get all pairs
    0 0
    0 1
    1 0
    1 1

    ri = iter(range(3))
    for i in ri:
        for j in ri:
            print(i,j)
    # this is somewhat deceptive as the outer loop executes just once
    0 1
    0 2

I personally would add a 'break' after 'outer_line = next(f)', since the first loop is effectively done anyway at that point, and dedent the second for statement. I find the following clearer:

    ri = iter(range(3))
    for i in ri:
        break
    for j in ri:
        print(i,j)
    # this makes it clear that the first loop executes just once
    0 1
    0 2

I would only nest if the inner loop could terminate without exhausting the iterator and I wanted the outer loop to then resume.

--
Terry Jan Reedy
Re: Nested iteration?
On 23 April 2013 21:49, Terry Jan Reedy tjre...@udel.edu wrote:
>     ri = iter(range(3))
>     for i in ri:
>         for j in ri:
>             print(i,j)
>     # this is somewhat deceptive as the outer loop executes just once
>     0 1
>     0 2
>
> I personally would add a 'break' after 'outer_line = next(f)', since the first loop is effectively done anyway at that point, and dedent the second for statement. I find to following clearer
>
>     ri = iter(range(3))
>     for i in ri:
>         break
>     for j in ri:
>         print(i,j)
>     # this makes it clear that the first loop executes just once
>     0 1
>     0 2
>
> I would only nest if the inner loop could terminate without exhausting the iterator and I wanted the outer loop to then resume.

Surely a normal programmer would think "next(ri, None)" rather than a loop that just breaks.
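For comparison, the next(ri, None) form might look like the sketch below. The variable names first, rest and empty_first are made up for illustration. The two-argument form of the built-in next() returns the default instead of raising StopIteration when the iterator is empty.

```python
ri = iter(range(3))

# Take the first item; None is returned if the iterator is already empty.
first = next(ri, None)

# The remaining items are consumed by an ordinary, un-nested loop.
rest = []
for j in ri:
    rest.append((first, j))

# With an empty iterator the default comes back and nothing is raised.
empty_first = next(iter([]), None)

print(first, rest, empty_first)
```

If None is a legitimate value in the data, a dedicated sentinel object would be a safer default than None.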
Re: Nested iteration?
On 23 April 2013 17:30, Ian Kelly ian.g.ke...@gmail.com wrote:
> On Tue, Apr 23, 2013 at 10:21 AM, Chris Angelico ros...@gmail.com wrote:
>> The definition of the for loop is sufficiently simple that this is safe, with the caveat already mentioned (that __iter__ is just returning self). And calling next() inside the loop will simply terminate the loop if there's nothing there, so I'd not have a problem with code like that - for instance, if I wanted to iterate over pairs of lines, I'd happily do this:
>>
>>     for line1 in f:
>>         line2 = next(f)
>>         print(line2)
>>         print(line1)
>>
>> That'll happily swap pairs, ignoring any stray line at the end of the file. Why bother catching StopIteration just to break?
>
> The next() there will *not* simply terminate the loop if it raises a StopIteration; for loops do not catch StopIteration exceptions that are raised from the body of the loop. The StopIteration will continue to propagate until it is caught or it reaches the sys.excepthook. In unusual circumstances, it is even possible that it could cause some *other* loop higher in the stack to break (i.e. if the current code is being run as a result of the next() method being called by the looping construct).

I don't find that the circumstances are unusual. Pretty much any time one of the functions in the call stack is a generator this problem will occur if StopIteration propagates.

I just thought I'd add that Python 3 has a convenient way to avoid this problem with next() which is to use the starred unpacking syntax:

    >>> numbers = [1, 2, 3, 4]
    >>> first, *numbers = numbers
    >>> first
    1
    >>> for x in numbers:
    ...     print(x)
    ...
    2
    3
    4
    >>> first, *numbers = []
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: need more than 0 values to unpack

Since we get a ValueError instead of a StopIteration we don't have the described problem.

Oscar
Re: Nested iteration?
On 23 April 2013 22:29, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
> I just thought I'd add that Python 3 has a convenient way to avoid this problem with next() which is to use the starred unpacking syntax:
>
>     >>> numbers = [1, 2, 3, 4]
>     >>> first, *numbers = numbers

That creates a new list every time. You'll not want that over try-next-except if you're doing this in a loop, and in addition (if you were talking in context) your method will exhaust the iterator in the outer loop.
Re: Nested iteration?
On 23 April 2013 22:41, Joshua Landau joshua.landau...@gmail.com wrote:
> On 23 April 2013 22:29, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
>> I just thought I'd add that Python 3 has a convenient way to avoid this problem with next() which is to use the starred unpacking syntax:
>>
>>     >>> numbers = [1, 2, 3, 4]
>>     >>> first, *numbers = numbers
>
> That creates a new list every time. You'll not want that over try-next-except if you're doing this in a loop, and on addition (if you were talking in context) your method will exhaust the iterator in the outer loop.

Oh, you're right. I'm not using Python 3 yet and I assumed without checking that it would be giving me an iterator rather than unpacking everything into a list. Then the best I can think of is a helper function:

    >>> def unpack(iterable, count):
    ...     iterator = iter(iterable)
    ...     for n in range(count):
    ...         yield next(iterator)
    ...     yield iterator
    ...
    >>> numbers = [1, 2, 3, 4]
    >>> first, numbers = unpack(numbers, 1)
    >>> first
    1
    >>> numbers
    <list_iterator object at 0x24e1590>
    >>> list(numbers)
    [2, 3, 4]
    >>> first, numbers = unpack([], 1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: need more than 0 values to unpack

Oscar
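The generator-based helper above depends on a StopIteration from next() quietly ending the generator, which is how Python behaved in 2013. On Python 3.7 and later (PEP 479) that leaked StopIteration becomes a RuntimeError instead. A possible variant that does not depend on this is sketched below as a plain function; it is an adaptation for illustration, not Oscar's code, and the error message wording is invented.

```python
def unpack(iterable, count):
    """Return the first `count` items followed by the remaining iterator.

    Raises ValueError (never StopIteration) if there are too few items.
    """
    iterator = iter(iterable)
    head = []
    for _ in range(count):
        try:
            head.append(next(iterator))
        except StopIteration:
            # Message text is made up for this sketch.
            raise ValueError(
                "need at least %d values to unpack, got %d" % (count, len(head))
            ) from None
    return tuple(head) + (iterator,)


first, numbers = unpack([1, 2, 3, 4], 1)
remaining = list(numbers)  # the tail is still lazy until consumed here

try:
    unpack([], 1)
    error_name = None
except ValueError as exc:
    error_name = type(exc).__name__

print(first, remaining, error_name)
```

Because the tail is returned as an iterator, no new list is built on each call, which addresses the objection raised against starred unpacking.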