On Tue, Dec 11, 2018 at 12:48:10PM +0100, E. Madison Bray wrote:

> Right now I'm specifically responding to the sub-thread that Greg
> started "Suggested MapView object", so I'm considering this a mostly
> clean slate from the previous thread "__len__() for map()". Different
> ideas have been tossed around and the discussion has me thinking about
> broader possibilities. I responded to this thread because I liked
> Greg's proposal and the direction he's suggesting.

Greg's code can be found here:

https://mail.python.org/pipermail/python-ideas/2018-December/054659.html

His MapView tries to be both an iterator and a sequence at the same
time, but it is neither.

The iterator protocol is that iterators must:

- have a __next__ method;
- have an __iter__ method which returns self;

and the test for an iterator is:

    obj is iter(obj)

https://docs.python.org/3/library/stdtypes.html#iterator-types

Greg's MapView object is an *iterable* with a __next__ method, which
makes it neither a sequence nor an iterator, but a hybrid that will
surprise people who expect it to act consistently as either.

This is how iterators work:

py> x = iter("abcdef")  # An actual iterator.
py> next(x)
'a'
py> next(x)
'b'
py> next(iter(x))
'c'

Greg's hybrid violates that expected behaviour:

py> x = MapView(str.upper, "abcdef")  # An imposter.
py> next(x)
'A'
py> next(x)
'B'
py> next(iter(x))
'A'

As an iterator, it is officially "broken", continuing to yield values
even after it is exhausted:

py> x = MapView(str.upper, 'a')
py> next(x)
'A'
py> next(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/steve/gregmapview.py", line 24, in __next__
    return next(self.iterator)
StopIteration
py> list(x)  # But wait! There's more!
['A']
py> list(x)  # And even more!
['A']

This hybrid is fragile: whether operations succeed or not depends on
the order in which you call them:

py> x = MapView(str.upper, "abcdef")
py> len(x)*next(x)  # Safe. But only ONCE.
'AAAAAA'
py> y = MapView(str.upper, "uvwxyz")
py> next(y)*len(y)  # Looks safe. But isn't.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/steve/gregmapview.py", line 12, in __len__
    raise TypeError("Mapping iterator has no len()")
TypeError: Mapping iterator has no len()

(For brevity, from this point on I shall trim the tracebacks and show
only the final error message.)

Things that work once don't work a second time:

py> len(x)*next(x)  # Worked a moment ago, but now it is broken.
TypeError: Mapping iterator has no len()

If you pass your MapView object to another function, it can
accidentally sabotage your code:

py> def innocent_looking_function(obj):
...     next(obj)
...
py> x = MapView(str.upper, "abcdef")
py> len(x)
6
py> innocent_looking_function(x)
py> len(x)
TypeError: Mapping iterator has no len()

I presume this is just an oversight, but indexing continues to work
even when len() has been broken.

Greg seems to want to blame the unwitting coder who runs into these
boobytraps:

    "But there are no surprises as long as you stick to one interface
    or the other. Weird things happen if you mix them up, but sane
    code won't be doing that."

(URL as above).

This MapView class offers a hybrid "sequence plus iterator, together at
last!" double-headed API, and even its creator says that sane code
shouldn't use that API.

Unfortunately, you can't use the iterator API, because it's broken as
an iterator, and you can't use it as a sequence, because any function
you pass it to might use it as an iterator and pull the rug out from
under your feet.

Greg's code is, apart from the addition of the __next__ method, almost
identical to the version of mapview I came up with in my own testing.
Except Greg's is even better, since I didn't bother handling the
multiple-sequences case and his does.

It's the __next__ method which ruins it, by trying to graft
almost-but-not-really iterator behaviour onto something which otherwise
is a sequence. I don't think there's any way around that: I think that
any attempt to make a single MapView object work as either a sequence
with a length and indexing AND an iterator with next() and no length
and no indexing is doomed to the same problems. Far from minimising
surprise, it will maximise it.

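A minimal sketch of the sequence-only alternative, that is, a map view
with no __next__ method at all. It is illustrative only: it is neither
Greg's code nor the mapview version mentioned above, and the name
SeqMapView and its details are hypothetical. It assumes every argument
is a real sequence supporting len() and integer indexing, and it relies
on collections.abc.Sequence to supply __iter__, so each call to iter()
returns a fresh, independent iterator.

```python
from collections.abc import Sequence


class SeqMapView(Sequence):
    """Hypothetical lazy view of func applied across one or more sequences.

    Deliberately has no __next__: it is a sequence, not an iterator.
    """

    def __init__(self, func, seq, *more):
        self.func = func
        self.seqs = (seq, *more)

    def __len__(self):
        # Like map(), stop at the shortest input.
        return min(len(s) for s in self.seqs)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Assumption for brevity: slices are evaluated eagerly to a list.
            return [self[i] for i in range(len(self))[index]]
        n = len(self)
        if index < 0:
            index += n
        if not 0 <= index < n:
            raise IndexError("SeqMapView index out of range")
        return self.func(*(s[index] for s in self.seqs))


view = SeqMapView(str.upper, "abcdef")
it = iter(view)            # a separate iterator object, not the view itself
assert it is not view
assert iter(it) is it      # the real iterator passes the iterator test
assert next(it) == 'A'
assert len(view) == 6      # consuming the iterator leaves the view intact
assert view[-1] == 'F'
assert list(it) == ['B', 'C', 'D', 'E', 'F']
assert list(it) == []      # once exhausted, it stays exhausted
```

With this split, next(view) raises TypeError because the view is not an
iterator, and anything that wants iterator behaviour has to call iter()
first, which cannot disturb len() or indexing on the view.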
Look at how many violations of the Principle Of Least Surprise Greg's
MapView has:

- if an object has a __len__ method, calling len() on it shouldn't
  raise TypeError;

- if you called len() before, and it succeeded, calling it again
  should also succeed;

- if an object has a __next__ method, it should be an iterator, and
  that means iter(obj) is obj;

- if it isn't an iterator, you shouldn't be able to call next() on it;

- if it is an iterator, once it is exhausted, it should stay exhausted;

- iterating over an object (calling next() or iter() on it) shouldn't
  change it from a sequence to a non-sequence;

- passing a sequence to another function shouldn't result in that
  sequence no longer supporting len() or indexing;

- if an object has a length, then it should still have a length even
  after iterating over it.

I may have missed some.


-- 
Steve
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/