New submission from Peter Hammer <pham...@cardious.com>: """ A point of confusion using the builtin function 'enumerate' and enlightenment for those who, like me, have been confused.
Note, this confusion was discussed at length at http://bugs.python.org/issue2831 prior to the 'start' parameter being added to 'enumerate'. The confusion discussed herein was forseen in that discussion, and ultimately discounted. There remains, IMO, an issue with the clarity of the documentation that needs to be addressed. That is, the closed issue at http://bugs.python.org/issue8635 concerning the 'enumerate' docstring does not address the confusion that prompted this posting. Consider: x=['a','b','c','d','e'] y=['f','g','h','i','j'] print 0,y[0] for i,c in enumerate(y,1): print i,c if c=='g': print x[i], 'y[%i]=g' % (i) continue print x[i] This code produces the following unexpected output, using python 2.7, which is apparently the correct behavior (see commentary below). This example is an abstract simplification of a program defect encountered in practice: >>> 0 f 1 f b 2 g c y[2]=g 3 h d 4 i e 5 j Traceback (most recent call last): File "Untitled", line 9 print x[i] IndexError: list index out of range Help on 'enumerate' yields: >>> help(enumerate) Help on class enumerate in module __builtin__: class enumerate(object) | enumerate(iterable[, start]) -> iterator for index, value of iterable | | Return an enumerate object. iterable must be another object that supports | iteration. The enumerate object yields pairs containing a count (from | start, which defaults to zero) and a value yielded by the iterable argument. | enumerate is useful for obtaining an indexed list: | (0, seq[0]), (1, seq[1]), (2, seq[2]), ... | | Methods defined here: | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __iter__(...) | x.__iter__() <==> iter(x) | | next(...) | x.next() -> the next value, or raise StopIteration | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __new__ = <built-in method __new__ of type object> | T.__new__(S, ...) -> a new object with type S, a subtype of T >>> Commentary: The expected output was: >>> 0 f 1 g b y[2]=g 2 h c 3 i d 4 j e >>> That is, it was expected that the iterator would yield a value corresponding to the index, whether the index started at zero or not. Using the notation of the doc string, with start=1, the expected behavior was: | (1, seq[1]), (2, seq[2]), (3, seq[3]), ... while the actual behavior is: | (1, seq[0]), (2, seq[1]), (3, seq[2]), ... The practical problem in the real world code was to do something special with the zero index value of x and y, then run through the remaining values, doing one of two things with x and y, correlated, depending on the value of y. I can see now that the doc string does in fact correctly specify the actual behavior: nowhere does it say the iterator will begin at any other place than the beginning, so this is not a python bug. I do however question the general usefulness of such behavior. Normally, indices and values are expected to be correlated. The correct behavior can be simply implemented without using 'enumerate': x=['a','b','c','d','e'] y=['f','g','h','i','j'] print 0,y[0] for i in xrange(1,len(y)): c=y[i] print i,c if c=='g': print x[i], 'y[%i]=g' % (i) continue print x[i] This produces the expected results. If one insists on using enumerate to produce the correct behavior in this example, it can be done as follows: """ x=['a','b','c','d','e'] y=['f','g','h','i','j'] seq=enumerate(y) print '%s %s' % seq.next() for i,c in seq: print i,c if c=='g': print x[i], 'y[%i]=g' % (i) continue print x[i] """ This version produces the expected results, while achieving clarity comparable to that which was sought in the original incorrect code. Looking a little deeper, the python documentation on enumerate states: enumerate(sequence[, start=0]) Return an enumerate object. sequence must be a sequence, an iterator, or some other object which supports iteration. The next() method of the iterator returned by enumerate() returns a tuple containing a count (from start which defaults to 0) and the corresponding value obtained from iterating over iterable. enumerate() is useful for obtaining an indexed series: (0, seq[0]), (1, seq[1]), (2, seq[2]), This makes a pretty clear implication the value corresponds to the index, so perhaps there really is an issue here. Have at it. I'm going back to work, using 'enumerate' as it actually is, now that I clearly understand it. One thing is certain: the documentation has to be clarified, for the confusion foreseen prior to adding the start parameter is very real. """ ---------- components: None messages: 134162 nosy: phammer priority: normal severity: normal status: open title: 'enumerate' 'start' parameter documentation is confusing type: behavior versions: Python 2.7 _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue11889> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com