On 9/20/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> [Guido]
> > I just finished debugging some code that broke after upgrading to
> > Python 2.4 (from 2.3). Turns out the code was testing list iterators
> > for their boolean value (to distinguish them from None). In 2.3, a
> > list iterator (like any iterator) is always true. In 2.4, an exhausted
> > list iterator is false; probably by virtue of having a __len__()
> > method that returns the number of remaining items.
> >
> > I realize that this was a deliberate feature, and that it exists in
> > 2.4 as well as in 2.4.1 and will in 2.4.2; yet, I'm not sure I *like*
> > it. Was this breakage (which is not theoretical!) considered at all?
>
> It was not considered.

That's too bad.

> AFAICT, 2.3 code assuming the Boolean value of
> an iterator being true was relying on an accidental implementation
> detail that may not also be true in Jython, PyPy, etc.

That's bullshit, and you know it -- you're just using this to justify
that you didn't think of this.

Whether an object is true or not is well-defined by the language and
not by an accident of the implementation. Apart from None, all objects
are always true unless they define either __nonzero__() or (in its
absence) __len__(). The iterators for builtin sequences were carefully
designed to have the minimal API required of iterators -- i.e., next()
and __iter__() and nothing more.

> Likewise, it is
> not universally true for arbitrary class based iterators which may have
> other methods including __nonzero__ or __len__.

And those are the *only* ones that affect the boolean value.

> The Boolean value of an
> iterator is certainly not promised by the iterator protocol as specified
> in the docs or the PEP.

It was implied by not specifying a __nonzero__ or __len__.

> The code, bool(it), is not really clear about
> its intent and seems a little weird to me.

Of course that's not what the broken code actually looked like. It was
something like

    if ...:
        iter1 = iter(...)
    else:
        iter1 = None
    if ...:
        iter2 = iter(...)
    else:
        iter2 = None
    ...
    if iter1 and iter2:
        ...

where the arguments to the iter() functions were known to be lists.

> The reason it wasn't
> considered was that it wasn't on the radar screen as even a possible use
> case.

Could you at least admit that this was an oversight and not try to
pretend it was intentional breakage?

> On a tangential note, I think in 2.2 or 2.3, we found a number of bugs
> related to None testing. IIRC, the outcome of that conversation was a
> specific recommendation to NOT determine Noneness by Boolean tests.
> That recommendation ended-up making it into PEP 290:
>
> http://www.python.org/peps/pep-0290.html#testing-for-none

And I agree with that one in general (I was bitten by this in Zope
once). But it bears a lot more weight when the type of the object is
unknown or partially unknown. In my case, there was no possibility
that the iter() argument was anything except a list, so the type of
the iterator was fully known.

> [Fred]
> > think iterators shouldn't have length at all:
> > they're *not* containers and shouldn't act that way.
>
> Some iterators can usefully report their length with the invariant:
> len(it) == len(list(it)).

I still consider this an erroneous hypergeneralization of the concept
of iterators. Iterators should be pure iterators and not also act as
containers. Which other object type implements __len__ but not
__getitem__?

> There are some use cases for having the length when available. Also,
> there has been plenty of interest in being able to tell, when possible,
> if an iterator is empty without having to call it. AFAICT, the only
> downside was Guido's bool(it) situation.

There is plenty of interest in broken features all the time. IMO
giving *some* iterators discoverable length (and other properties like
reversibility) but not all of them makes the iterator protocol more
error-prone -- we're back to the situation where someone codes an
algorithm for use with arbitrary iterators but only tests it with list
and tuple iterators, and it ends up breaking in the field. I know you
can work around it, but that requires introspection which is not a
great match for this kind of application.

> FWIW, the origin of the idea came from reading a comp-sci paper about
> ways to overcome the limitations of linking operations together using
> only iterators (the paper's terminology talked about map/fold
> operations). The issue was that decoupling benefits were partially
> offset by the loss of useful information about the input to an operation
> (i.e.
> the supplier may know and the consumer may want to know the input
> size, the input type, whether the elements are unique, whether the data
> is sorted, its provenance, etc.)

Not every idea written up in a comp-sci paper is worth implementing
(as acquisition in Zope 2 has amply proved).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
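
For readers who want to reproduce the truth-value behavior discussed in
this message, here is a minimal sketch. It is written for Python 3, where
__nonzero__() is spelled __bool__() and next() is spelled __next__().
LenIterator and PlainIterator are hypothetical classes made up for
illustration, not the actual 2.3 or 2.4 list iterators; the two helper
functions are likewise assumptions that mirror the "if iter1 and iter2"
test quoted above and the identity test recommended in PEP 290.

```python
class PlainIterator:
    """Hypothetical iterator with the minimal API: __iter__ and __next__ only."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        value = self._items[self._pos]
        self._pos += 1
        return value


class LenIterator(PlainIterator):
    """Hypothetical iterator that also reports its remaining item count."""

    def __len__(self):
        # With no __bool__ defined, truth testing falls back to __len__,
        # so an exhausted iterator of this kind tests as false.
        return len(self._items) - self._pos


def both_present_by_truth(iter1, iter2):
    # Mirrors the "if iter1 and iter2:" test from the message.
    return bool(iter1 and iter2)


def both_present_by_identity(iter1, iter2):
    # Tests for None explicitly instead of relying on truth value.
    return iter1 is not None and iter2 is not None


if __name__ == "__main__":
    exhausted = LenIterator([])
    other = LenIterator([1, 2])
    print(bool(PlainIterator([])))                     # always true
    print(bool(exhausted))                             # false once empty
    print(both_present_by_truth(exhausted, other))     # empty looks like None
    print(both_present_by_identity(exhausted, other))  # unaffected by __len__
```

The identity test gives the same answer whether or not the iterator
defines __len__, which is why it survives a change like the one described
in this thread.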