Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On 12/11/2018 6:50 PM, Greg Ewing wrote:

> I'm not necessarily saying this *should* be done, just pointing out
> that it's a possible strategy for migrating map() from an iterator
> to a view, if we want to do that.

Python has list and list_iterator, tuple and tuple_iterator, set and set_iterator, dict and dict_iterator, range and range_iterator.

In 3.0, we could have turned map into a finite sequence analogous to range, and added a new map_iterator. To be completely lazy, such a map would have to restrict input to Sequences. To be compatible with 2.0 map, it would have to use list(iterable) to turn other finite iterables into concrete lists, making it only semi-lazy.

Since I am too lazy to write the multi-iterable version, here is the one-iterable version to show the idea.

    def __init__(self, func, iterable):
        self.func = func
        self.seq = iterable if isinstance(iterable, Sequence) else list(iterable)

Given the apparent little need for the extra complication, and the possibility of keeping a reference to sequences and explicitly applying list otherwise, it was decided to rebind 'map' to the fully lazy and general itertools.imap.

--
Terry Jan Reedy

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
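A minimal sketch of the semi-lazy, one-iterable idea described above, filled out so that it runs. The class name SemiLazyMap and the use of the collections.abc.Sequence mixin are assumptions made for illustration; they are not part of the original message.

```python
from collections.abc import Sequence


class SemiLazyMap(Sequence):
    """One-iterable sketch: lazy over sequences, eager copy otherwise."""

    def __init__(self, func, iterable):
        self.func = func
        # Keep a reference to a real sequence; materialise anything else.
        self.seq = iterable if isinstance(iterable, Sequence) else list(iterable)

    def __len__(self):
        return len(self.seq)

    def __getitem__(self, index):
        # Integer indexing only in this sketch; slices are not handled.
        return self.func(self.seq[index])


squares = SemiLazyMap(lambda n: n * n, range(5))
from_gen = SemiLazyMap(str.upper, (c for c in "abc"))  # generator is copied to a list

print(len(squares), squares[3], list(squares))
print(list(from_gen), list(from_gen))  # re-iterable, unlike a plain map()
```

The generator case shows the "semi" in semi-lazy: the input is consumed once, up front, and only the function calls are deferred.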
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On Wed, Dec 12, 2018 at 11:31:03AM +1300, Greg Ewing wrote:
> Steven D'Aprano wrote:
> >I suggest we provide a separate mapview() type that offers only the lazy
> >sequence API, without trying to be an iterator at the same time.
>
> Then we would be back to the bad old days of having two functions
> that do almost exactly the same thing.

They aren't "almost exactly the same thing". One is a sequence, which is a rich API that includes random access to items and a length; the other is an iterator, which is an intentionally simple API which fails to meet the needs of some users.

> My suggestion was made in
> the interests of moving the language in the direction of having
> less warts, rather than adding more or moving the existing ones
> around.
>
> I acknowledge that the dual interface is itself a bit wartish,

It's a "bit wartish" in the same way that the sun is "a bit warmish".

> but it's purely for backwards compatibility

And it fails at that too.

    x = map(str.upper, "abcd")
    x is iter(x)

returns True with the current map, an actual iterator, and False with your hybrid.

Current map() is a proper, non-broken iterator; your hybrid is a broken iterator. (That's not me being derogative: it's the official term for iterators which don't stay exhausted.)

I'd be more charitable if I thought the flaws were mere bugs that could be fixed. But I don't think there is any way to combine two incompatible interfaces, the sequence and iterator APIs, into one object without these sorts of breakages.

Take the __next__ method out of your object, and it is a better version of what I proposed earlier. With the __next__ method, it's just broken.

--
Steve
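The two properties this message relies on (an iterator is its own iterator, and an exhausted iterator stays exhausted) can be checked directly against the built-in map. The helper name is_iterator is hypothetical and only wraps the identity test quoted above.

```python
def is_iterator(obj):
    """The test from the thread: an iterator returns itself from iter()."""
    return iter(obj) is obj


m = map(str.upper, "abcd")
print(is_iterator(m))       # the built-in map object is an iterator
print(is_iterator("abcd"))  # a str is iterable, but it is not an iterator

first_pass = list(m)        # consume every item
second_pass = list(m)       # a well-behaved iterator yields nothing more
print(first_pass, second_pass)
```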
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
Steven D'Aprano wrote:

> The iterator protocol is that iterators must:
>
> - have a __next__ method;
> - have an __iter__ method which returns self;
>
> and the test for an iterator is:
>
>     obj is iter(obj)

By that test, it identifies as a sequence, as does testing it for the presence of __len__:

    >>> m is iter(m)
    False
    >>> hasattr(m, '__len__')
    True

So, code that doesn't know whether it has a sequence or iterator and tries to find out, will conclude that it has a sequence. Presumably it will then proceed to treat it as a sequence, which will work fine.

> py> x = MapView(str.upper, "abcdef")  # An imposter.
> py> next(x)
> 'A'
> py> next(x)
> 'B'
> py> next(iter(x))
> 'A'

That's a valid point, but it can be fixed:

    def __iter__(self):
        return self.iterator or map(self.func, *self.args)

Now it gives

    >>> next(x)
    'A'
    >>> list(x)
    []

There is still one case that will behave differently from the current map(), i.e. using list() first and then expecting it to behave like an exhausted iterator. I'm finding it hard to imagine real code that would depend on that behaviour, though.

> whether operations succeed or not depends on the order that you call them:
>
> py> x = MapView(str.upper, "abcdef")
> py> len(x)*next(x)  # Safe. But only ONCE.

But what sane code is going to do that? Remember, the iterator interface is only there for backwards compatibility. That would fail under both Python 2 and the current Python 3.

> py> def innocent_looking_function(obj):
> ...     next(obj)
> ...
> py> x = MapView(str.upper, "abcdef")
> py> len(x)
> 6
> py> innocent_looking_function(x)
> py> len(x)
> TypeError: Mapping iterator has no len()

If you're using len(), you clearly expect to have a sequence, not an iterator, so why are you calling a function that blindly expects an iterator? Again, this cannot be and could never have been working code.

> I presume this is just an oversight, but indexing continues to work even
> when len() has been broken.

That could be fixed.

> This MapView class offers a hybrid "sequence plus iterator, together at
> last!" double-headed API, and even its creator says that sane code
> shouldn't use that API.

No. I would document it like this: It provides a sequence API. It also, *for backwards compatibility*, implements some parts of the iterator API, but new code should not rely on that, nor should any code expect to be able to use both interfaces on the same object.

The backwards compatibility would not be perfect, but I think it would work in the vast majority of cases.

I also envisage that the backwards compatibility provisions would not be kept forever, and that it would eventually become a pure sequence object.

I'm not necessarily saying this *should* be done, just pointing out that it's a possible strategy for migrating map() from an iterator to a view, if we want to do that.

--
Greg
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On Tue, Dec 11, 2018 at 11:10 AM Terry Reedy wrote:
> > I _think_ someone may be advocating that map() could return an
> > iterable if it is passed a iterable,
>
> I believe you mean 'iterator' rather than 'iterable' here and below as a
> sequence is an iterable.

Well, the iterator / iterable distinction is important in this thread in many places, so I should have been more careful about that -- but not for this reason.

Yes, a sequence is an iterable, but what I meant was an "iterable-that-is-not-a-sequence".

-CHB

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
chris.bar...@noaa.gov
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
Steven D'Aprano wrote:

> I suggest we provide a separate mapview() type that offers only the lazy
> sequence API, without trying to be an iterator at the same time.

Then we would be back to the bad old days of having two functions that do almost exactly the same thing. My suggestion was made in the interests of moving the language in the direction of having less warts, rather than adding more or moving the existing ones around.

I acknowledge that the dual interface is itself a bit wartish, but it's purely for backwards compatibility, so it could be deprecated and eventually removed if desired.

--
Greg
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On 12/11/2018 12:01 PM, Chris Barker - NOAA Federal via Python-ideas wrote:

> Perhaps I got confused by the early part of this discussion.
>
> My point was that there is no “map-like” object at the Python level.
> (That is, no Map ABC.)
>
> Py2’s map produced a sequence. Py3’s map produced an iterable.
>
> So any API that was expecting a sequence could accept the result of a
> py2 map, but not a py3 map. There is absolutely nothing special about
> map here.
>
> The example of range has been brought up, but I don’t think it’s
> analogous -- py2 range returns a list, py3 range returns an immutable
> sequence. Because that’s as close as we can get to a sequence while
> preserving the lazy evaluation that is wanted.
>
> I _think_ someone may be advocating that map() could return an
> iterable if it is passed a iterable,

I believe you mean 'iterator' rather than 'iterable' here and below as a sequence is an iterable.

> and a sequence if it is passed a sequence. Yes, it could, but that
> seems like a bad idea to me.
>
> But folks are proposing a “map” that would produce a lazy-evaluated
> sequence. Sure -- as Paul said, put it up on pypi and see if folks find
> it useful.
>
> Personally, I’m still finding it hard to imagine a use case where you
> need the sequence features, but also lazy evaluation is important.
>
> Sure: range() has that, but it came at almost zero cost, and I’m not
> sure the sequence features are used much.
>
> Note: the one use-case I can think of for a lazy evaluated sequence
> instead of an iterable is so that I can pick a random element with
> random.choice(). (Try to pick a random item from a dict), but that
> doesn’t apply here -- pick a random item from the source sequence
> instead.
>
> But this is a specific example of a general use case: you need to
> access only a subset of the mapped sequence (or access it out of order)
> so using the iterable version won’t work, and it may be large enough
> that making a new sequence is too resource intensive.
>
> Seems rare to me, and in many cases, you could do the subsetting
> before applying the function, so I think it’s a pretty rare use case.
>
> But go ahead and make it -- I’ve been wrong before :-)
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On 12/11/2018 6:48 AM, E. Madison Bray wrote:

> The idea would be to now enhance the existing built-ins to restore at
> least some previously lost assumptions, at least in the relevant
> cases. To give an analogy, Python 3.0 replaced range() with
> (effectively) xrange(). This broke a lot of assumptions that the
> object returned by range(N) would work much like a list,

A range represents an arithmetic sequence. Any usage of range that could be replaced by xrange, which is nearly all uses, made no assumption broken by xrange. The basic assumption was and is that a range/xrange could be repeatedly iterated. That this assumption was met in the first case by returning a list was somewhat of an implementation detail. In terms of mutability, a tuple would have been better, as range objects should not be mutable. (If [2,4,6] is mutated to [2,3,7], it is no longer a range (arithmetic sequence).)

> and Python 3.2 restored some of that list-like functionality

As I see it, xranges were unfinished as sequence objects and 3.2 finished the job. This included having the min() and max() builtins calculate the min and max efficiently, as a human would, as the first or last of the sequence, rather than uselessly iterating and comparing all the items in the sequence.

A proper analogy to range would be a re-iterable mapview (or 'mapseq') like what Steven D'Aprano proposes.

> ** I have a separate complaint that there's no great way, at the Python
> level, to define a class that is explicitly a "sequence" as opposed to
> a more general "mapping",

You mean like this?

    >>> from collections.abc import Sequence as S
    >>> isinstance((), S)
    True
    >>> isinstance([], S)
    True
    >>> isinstance(range(5), S)
    True
    >>> isinstance({}, S)
    False
    >>> isinstance(set(), S)
    False
    >>> class NItems(S):
    ...     def __init__(self, n, item):
    ...         self.len = n
    ...         self.item = item
    ...     def __getitem__(self, i):  # missing index check
    ...         return self.item
    ...     def __len__(self):
    ...         return self.len
    ...
    >>> isinstance(NItems(2, 3), S)
    True

--
Terry Jan Reedy
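As a hedged illustration of what subclassing collections.abc.Sequence buys beyond the isinstance() check: once __getitem__ and __len__ exist, the ABC supplies __iter__, __contains__, __reversed__, index() and count() as mixin methods. The sketch below re-creates NItems with the index check that the message marks as missing; the exact bounds logic and the sample values are assumptions.

```python
from collections.abc import Sequence


class NItems(Sequence):
    """The same item repeated n times, with an index check added."""

    def __init__(self, n, item):
        self.len = n
        self.item = item

    def __getitem__(self, i):
        # Integer indices only; the IndexError is what ends mixin iteration.
        if not -self.len <= i < self.len:
            raise IndexError("NItems index out of range")
        return self.item

    def __len__(self):
        return self.len


s = NItems(3, "x")
# None of the methods used below are defined in NItems itself.
print(list(s), "x" in s, s.count("x"), s.index("x"), list(reversed(s)))
```

Without the index check, the mixin __iter__ would never see an IndexError and iteration over the object would not terminate.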
Re: [Python-ideas] __len__() for map()
On 12/1/2018 2:08 PM, Steven D'Aprano wrote:

> This proof of concept wrapper class could have been written any time
> since Python 1.5 or earlier:
>
>     class lazymap:
>         def __init__(self, function, sequence):

One could now add at the top of the file

    from collections.abc import Sequence

and here

            if not isinstance(sequence, Sequence):
                raise TypeError(f'{sequence} is not a sequence')

>             self.function = function
>             self.wrapped = sequence
>         def __len__(self):
>             return len(self.wrapped)
>         def __getitem__(self, item):
>             return self.function(self.wrapped[item])

For 3.x, I would add

        def __iter__(self):
            return map(self.function, self.wrapped)

but your point that iteration is possible even without, with the old protocol, is well made.

> It is fully iterable using the sequence protocol, even in Python 3:
>
>     py> x = lazymap(str.upper, 'aardvark')
>     py> list(x)
>     ['A', 'A', 'R', 'D', 'V', 'A', 'R', 'K']
>
> Mapped items are computed on demand, not up front. It doesn't make a
> copy of the underlying sequence, it can be iterated over and over again,
> it has a length and random access. And if you want an iterator, you can
> just pass it to the iter() function.
>
> There are probably bells and whistles that can be added (a nicer repr?
> any other sequence methods? a cache?) and I haven't tested it fully.
>
> For backwards compatibility reasons, we can't just make map() work like
> this, because that's a change in behaviour. There may be tricky corner
> cases I haven't considered, but as a proof of concept I think it shows
> that the basic premise is sound and worth pursuing.

--
Terry Jan Reedy
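To make "computed on demand, not up front" concrete, here is a small sketch that records each call to the mapped function. The lazymap class is the proof of concept quoted above, without the optional isinstance() check or __iter__; the calls list and the shout function are hypothetical helpers added only for the demonstration.

```python
class lazymap:
    """Proof-of-concept lazy sequence wrapper, as quoted in the thread."""

    def __init__(self, function, sequence):
        self.function = function
        self.wrapped = sequence

    def __len__(self):
        return len(self.wrapped)

    def __getitem__(self, item):
        return self.function(self.wrapped[item])


calls = []  # hypothetical log of every argument the function was called with


def shout(word):
    calls.append(word)
    return word.upper()


x = lazymap(shout, ["aardvark", "bee", "cat"])
print(len(x), calls)  # the length is known before anything is computed
print(x[2], calls)    # only the requested item has been computed
print(list(x))        # iteration falls back to __getitem__(0), (1), (2), ...
print(calls)          # items are recomputed on every access, not cached
```

The last line shows the cost side of the tradeoff: nothing is cached, so "cat" is computed twice.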
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On 12/1/2018 8:07 PM, Greg Ewing wrote:
> Steven D'Aprano wrote:

After defining a separate iterable mapview sequence class

>> For backwards compatibility reasons, we can't just make map() work like
>> this, because that's a change in behaviour.
>
> Actually, I think it's possible to get the best of both worlds.

I presume you mean the '(iterable) sequence' and 'iterator' worlds. I don't think they should be mixed. A sequence is reiterable, an iterator is once through and done.

> Consider this:
>
>     from operator import itemgetter
>
>     class MapView:
>         def __init__(self, func, *args):
>             self.func = func
>             self.args = args
>             self.iterator = None
>         def __len__(self):
>             return min(map(len, self.args))
>         def __getitem__(self, i):
>             return self.func(*list(map(itemgetter(i), self.args)))
>         def __iter__(self):
>             return self
>         def __next__(self):
>             if not self.iterator:
>                 self.iterator = map(self.func, *self.args)
>             return next(self.iterator)

The last two (unnecessarily) restrict this to being a once through iterator. I think much better would be

        def __iter__(self):
            return map(self.func, *self.args)

--
Terry Jan Reedy
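A sketch of how the class might look with the change suggested above: no __next__, and an __iter__ that hands out a fresh map iterator each time. The negative-index normalisation and the bounds check are additions assumed here for the case of unequal-length inputs; they are not in either author's code, and the sketch assumes at least one input sequence.

```python
class MapView:
    """Re-iterable lazy view over one or more sequences (sketch only)."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __len__(self):
        # Like map(), stop at the shortest input.
        return min(map(len, self.args))

    def __getitem__(self, i):
        n = len(self)
        if i < 0:
            i += n  # normalise so every input is indexed at the same position
        if not 0 <= i < n:
            raise IndexError("MapView index out of range")
        return self.func(*(arg[i] for arg in self.args))

    def __iter__(self):
        # A new iterator per call, so the view itself is never exhausted.
        return map(self.func, *self.args)


v = MapView(pow, [2, 3, 4], [5, 2])
print(len(v), v[0], v[-1])
print(list(v), list(v))
print(iter(v) is v)
```

Because iter(v) is not v, the object fails the iterator identity test by design: it is a sequence that can produce iterators, not an iterator itself.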
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
Perhaps I got confused by the early part of this discussion. My point was that there is no “map-like” object at the Python level. (That is no Map abc). Py2’s map produced a sequence. Py3’s map produced an iterable. So any API that was expecting a sequence could accept the result of a py2 map, but not a py3 map. There is absolutely nothing special about map here. The example of range has been brought up, but I don’t think it’s analogous — py2 range returns a list, py3 range returns an immutable sequence. Because that’s as close as we can get to a sequence while preserving the lazy evaluation that is wanted. I _think_ someone may be advocating that map() could return an iterable if it is passed a iterable, and a sequence of it is passed a sequence. Yes, it could, but that seems like a bad idea to me. But folks are proposing a “map” that would produce a lazy-evaluated sequence. Sure — as Paul said, put it up on pypi and see if folks find it useful. Personally, I’m still finding it hard to imagine a use case where you need the sequence features, but also lazy evaluation is important. Sure: range() has that, but it came at almost zero cost, and I’m not sure the sequence features are used much. Note: the one use-case I can think of for a lazy evaluated sequence instead of an iterable is so that I can pick a random element with random.choice(). (Try to pick a random item from. a dict), but that doesn’t apply here—pick a random item from the source sequence instead. But this is specific example of a general use case: you need to access only a subset of the mapped sequence (or access it out of order) so using the iterable version won’t work, and it may be large enough that making a new sequence is too resource intensive. Seems rare to me, and in many cases, you could do the subsetting before applying the function, so I think it’s a pretty rare use case. 
But go ahead and make it — I’ve been wrong before :-) -CHB Sent from my iPhone > On Dec 11, 2018, at 6:47 AM, Steven D'Aprano wrote: > >> On Mon, Dec 10, 2018 at 05:15:36PM -0800, Chris Barker via Python-ideas >> wrote: >> [...] >> I'm still confused -- what's so wrong with: >> >> list(map(func, some_iterable)) >> >> if you need a sequence? > > You might need a sequence. Why do you think that has to be an *eager* > sequence? > > I can think of two obvious problems with eager sequences: space and > time. They can use too much memory, and they can take too much time to > generate them up-front and too much time to reap when they become > garbage. And if you have an eager sequence, and all you want is the > first item, you still have to generate all of them even though they > aren't needed. > > We can afford to be profligate with memory when the data is small, but > eventually you run into cases where having two copies of the data is one > copy too many. > > >> You can, of course mike lazy-evaluated sequences (like range), and so you >> could make a map-like function that required a sequence as input, and would >> lazy evaluate that sequence. This could be useful if you weren't going to >> work with the entire collection, > > Or even if you *are* going to work with the entire collection, but you > don't need them all at once. I once knew a guy whose fondest dream was > to try the native cuisine of every nation of the world ... but not all > in one meal. > > This is a classic time/space tradeoff: for the cost of calling the > mapping function anew each time we index the sequence, we can avoid > allocating a potentially huge list and calling a potentially expensive > function up front for items we're never going to use. Instead, we call > it only on demand. > > These are the same principles that justify (x)range and dict views. Why > eagerly generate a list up front, if you only need the values one at a > time on demand? 
Why make a copy of the dict keys, if you don't need a > copy? These are not rhetorical questions. > > This is about avoiding the need to make unnecessary copies for those > times we *don't* need an eager sequence generated up front, keeping the > laziness of iterators and the random-access of sequences. > > map(func, sequence) is a great candidate for this approach. It has to > hold onto a reference to the sequence even as an iterator. The function > is typically side-effect free (a pure function), and if it isn't, > "consenting adults" applies. We've already been told there's at least > one major Python project, Sage, where this would have been useful. > > There's a major functional language, Haskell, where nearly all sequence > processing follows this approach. > > I suggest we provide a separate mapview() type that offers only the lazy > sequence API, without trying to be an iterator at the same time. If you > want an eager sequence, or an iterator, they're only a single function > call away: > >list(mapview_instance) >iter(mapview_instance) # or just stick to map() > > Rather than trying to guess whether people want to tr
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On Tue, Dec 11, 2018 at 12:48:10PM +0100, E. Madison Bray wrote:

> Right now I'm specifically responding to the sub-thread that Greg
> started "Suggested MapView object", so I'm considering this a mostly
> clean slate from the previous thread "__len__() for map()". Different
> ideas have been tossed around and the discussion has me thinking about
> broader possibilities. I responded to this thread because I liked
> Greg's proposal and the direction he's suggesting.

Greg's code can be found here:

https://mail.python.org/pipermail/python-ideas/2018-December/054659.html

His MapView tries to be both an iterator and a sequence at the same time, but it is neither.

The iterator protocol is that iterators must:

- have a __next__ method;
- have an __iter__ method which returns self;

and the test for an iterator is:

    obj is iter(obj)

https://docs.python.org/3/library/stdtypes.html#iterator-types

Greg's MapView object is an *iterable* with a __next__ method, which makes it neither a sequence nor an iterator, but a hybrid that will surprise people who expect it to act consistently as either.

This is how iterators work:

    py> x = iter("abcdef")  # An actual iterator.
    py> next(x)
    'a'
    py> next(x)
    'b'
    py> next(iter(x))
    'c'

Greg's hybrid violates that expected behaviour:

    py> x = MapView(str.upper, "abcdef")  # An imposter.
    py> next(x)
    'A'
    py> next(x)
    'B'
    py> next(iter(x))
    'A'

As an iterator, it is officially "broken", continuing to yield values even after it is exhausted:

    py> x = MapView(str.upper, 'a')
    py> next(x)
    'A'
    py> next(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/steve/gregmapview.py", line 24, in __next__
        return next(self.iterator)
    StopIteration
    py> list(x)  # But wait! There's more!
    ['A']
    py> list(x)  # And even more!
    ['A']

This hybrid is fragile: whether operations succeed or not depends on the order that you call them:

    py> x = MapView(str.upper, "abcdef")
    py> len(x)*next(x)  # Safe. But only ONCE.
    'AA'

    py> y = MapView(str.upper, "uvwxyz")
    py> next(y)*len(y)  # Looks safe. But isn't.
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/steve/gregmapview.py", line 12, in __len__
        raise TypeError("Mapping iterator has no len()")
    TypeError: Mapping iterator has no len()

(For brevity, from this point on I shall trim the tracebacks and show only the final error message.)

Things that work once, don't work a second time.

    py> len(x)*next(x)  # Worked a moment ago, but now it is broken.
    TypeError: Mapping iterator has no len()

If you pass your MapView object to another function, it can accidentally sabotage your code:

    py> def innocent_looking_function(obj):
    ...     next(obj)
    ...
    py> x = MapView(str.upper, "abcdef")
    py> len(x)
    6
    py> innocent_looking_function(x)
    py> len(x)
    TypeError: Mapping iterator has no len()

I presume this is just an oversight, but indexing continues to work even when len() has been broken.

Greg seems to want to blame the unwitting coder who runs into these boobytraps:

"But there are no surprises as long as you stick to one interface or the other. Weird things happen if you mix them up, but sane code won't be doing that."

(URL as above).

This MapView class offers a hybrid "sequence plus iterator, together at last!" double-headed API, and even its creator says that sane code shouldn't use that API.

Unfortunately, you can't use the iterator API, because it's broken as an iterator, and you can't use it as a sequence, because any function you pass it to might use it as an iterator and pull the rug out from under your feet.

Greg's code is, apart from the addition of the __next__ method, almost identical to the version of mapview I came up with in my own testing. Except Greg's is even better, since I didn't bother handling the multiple-sequences case and his does.

It's the __next__ method which ruins it, by trying to graft almost-but-not-really iterator behaviour onto something which otherwise is a sequence. I don't think there's any way around that: I think that any attempt to make a single MapView object work as either a sequence with a length and indexing AND an iterator with next() and no length and no indexing is doomed to the same problems. Far from minimizing surprise, it will maximise it.

Look at how many violations of the Principle Of Least Surprise Greg's MapView has:

- If an object has a __len__ method, calling len() on it shouldn't raise TypeError;
- If you called len() before, and it succeeded, calling it again should also succeed;
- if an object has a __next__ method, it should be an iterator, and that means iter(obj) is obj;
- if it isn't an iterator, you shouldn't be able to call next() on it;
- if it is an iterator, once it is exhausted, it should stay exhausted;
- iterating over an object (calling next() or iter() on it) shouldn't change it from a sequence to a non-sequence;
- passing a sequence to another function, shouldn't resu
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On Mon, Dec 10, 2018 at 05:15:36PM -0800, Chris Barker via Python-ideas wrote:
[...]
> I'm still confused -- what's so wrong with:
>
> list(map(func, some_iterable))
>
> if you need a sequence?

You might need a sequence. Why do you think that has to be an *eager* sequence?

I can think of two obvious problems with eager sequences: space and time. They can use too much memory, and they can take too much time to generate them up-front and too much time to reap when they become garbage. And if you have an eager sequence, and all you want is the first item, you still have to generate all of them even though they aren't needed.

We can afford to be profligate with memory when the data is small, but eventually you run into cases where having two copies of the data is one copy too many.

> You can, of course make lazy-evaluated sequences (like range), and so you
> could make a map-like function that required a sequence as input, and would
> lazy evaluate that sequence. This could be useful if you weren't going to
> work with the entire collection,

Or even if you *are* going to work with the entire collection, but you don't need them all at once. I once knew a guy whose fondest dream was to try the native cuisine of every nation of the world ... but not all in one meal.

This is a classic time/space tradeoff: for the cost of calling the mapping function anew each time we index the sequence, we can avoid allocating a potentially huge list and calling a potentially expensive function up front for items we're never going to use. Instead, we call it only on demand.

These are the same principles that justify (x)range and dict views. Why eagerly generate a list up front, if you only need the values one at a time on demand? Why make a copy of the dict keys, if you don't need a copy? These are not rhetorical questions.

This is about avoiding the need to make unnecessary copies for those times we *don't* need an eager sequence generated up front, keeping the laziness of iterators and the random-access of sequences.

map(func, sequence) is a great candidate for this approach. It has to hold onto a reference to the sequence even as an iterator. The function is typically side-effect free (a pure function), and if it isn't, "consenting adults" applies. We've already been told there's at least one major Python project, Sage, where this would have been useful.

There's a major functional language, Haskell, where nearly all sequence processing follows this approach.

I suggest we provide a separate mapview() type that offers only the lazy sequence API, without trying to be an iterator at the same time. If you want an eager sequence, or an iterator, they're only a single function call away:

    list(mapview_instance)
    iter(mapview_instance)  # or just stick to map()

Rather than trying to guess whether people want to treat their map objects as sequences or iterators, we let them choose which they want and be explicit about it.

Consider the history of dict.keys(), values() and items() in Python 2. Originally they returned eager lists. Did we try to retrofit view-like and iterator-like behaviour onto the existing dict.keys() method, returning a cunning object which somehow turned from a list to a view to an iterator as needed? Hell no! We introduced *six new methods* on dicts:

- dict.iterkeys()
- dict.viewkeys()

and similar for items() and values().

Compared to that, adding a single variant on map() that expects a sequence and returns a view on the sequence seems rather timid.

--
Steve
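The two precedents named in this message, dict views and range, can be seen avoiding copies in current Python 3. This is only a small illustration; the dictionary contents and the sizes are arbitrary assumptions.

```python
prices = {"apple": 3, "pear": 5}
keys = prices.keys()      # a view onto the dict, not a copy of its keys
prices["plum"] = 7
print(list(keys))         # the view sees the key added after it was created

big = range(10**9)        # no billion-element list is ever built
print(len(big), big[123_456], big[-1])  # length and random access still work
```

A mapview over a sequence would sit in the same family: it holds a reference to its source and computes each answer when asked.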
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On Tue, 11 Dec 2018 at 11:49, E. Madison Bray wrote: > The idea would be to now enhance the existing built-ins to restore at > least some previously lost assumptions, at least in the relevant > cases. To give an analogy, Python 3.0 replaced range() with > (effectively) xrange(). This broken a lot of assumptions that the > object returned by range(N) would work much like a list, and Python > 3.2 restored some of that list-like functionality by adding support > for slicing and negative indexing on range(N). I believe it's worth > considering such enhancements for filter() and map() as well, though > these are obviously a bit trickier. Thanks. That clarifies the situation for me very well. I agree with most of the comments you made, although I don't have any good answers. I think you're probably right that Guido's original idea to move map and filter to functools might have been better, forcing users to explicitly choose between a genexp and a list comprehension. On the other hand, it might have meant people used more lists than they needed to, as a result. Paul ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On Tue, Dec 11, 2018 at 12:13 PM Paul Moore wrote: > > On Tue, 11 Dec 2018 at 10:38, E. Madison Bray wrote: > > I don't understand why this is confusing. > [...] > > For something like a fixed sequence a "map" could just as easily be > > defined as a pair (, ) that applies , > > which I'm claiming is a pure function, to every element returned by > > the . This transformation can be applied lazily on a > > per-element basis whether I'm iterating over it, or performing random > > access (since is known for all N). > > What's confusing to *me*, at least, is what's actually being suggested > here. There's a lot of theoretical discussion, but I've lost track of > how it's grounded in reality: It's true, this has been a wide-ranging discussion and it's confusing. Right now I'm specifically responding to the sub-thread that Greg started "Suggested MapView object", so I'm considering this a mostly clean slate from the previous thread "__len__() for map()". Different ideas have been tossed around and the discussion has me thinking about broader possibilities. I responded to this thread because I liked Greg's proposal and the direction he's suggesting. I think that the motivation underlying much of this discussion, forth both the OP who started the original thread, as well as myself, and others is that before Python 3 changed the implementation of map() there were certain assumptions one could make about map() called on a list* which, under normal circumstances were quite reasonable and sane (e.g. len(map(func, lst)) == len(lst), or map(func, lst)[N] == func(lst[N])). Python 3 broke all of these assumptions, for reasons that I personally have no disagreement with, in terms of motivation. However, in retrospect, it might have been nice if more consideration were given to backwards compatibility for some "obvious" simple cases. 
This isn't a Python 2 vs Python 3 whine though: I'm just trying to think about how I might expect map() to work on different types of arguments, and I see no problem--so long as it's properly documented--with making its behavior somewhat polymorphic on the types of arguments.

The idea would be to now enhance the existing built-ins to restore at least some previously lost assumptions, at least in the relevant cases. To give an analogy, Python 3.0 replaced range() with (effectively) xrange(). This broke a lot of assumptions that the object returned by range(N) would work much like a list, and Python 3.2 restored some of that list-like functionality by adding support for slicing and negative indexing on range(N). I believe it's worth considering such enhancements for filter() and map() as well, though these are obviously a bit trickier.

* or other fixed-length sequence, but let's just use list as a shorthand, and assume for the sake of simplicity a single list as well.

> 1. If we're saying that "it would be nice if there were a function
> that acted like map but kept references to its arguments", that's easy
> to do as a module on PyPI. Go for it - no-one will have any problem
> with that.

Sure, though since this is about the behavior of global built-ins that are commonly used by users at all experience levels, the problem is a bit hairier. Anybody can implement anything they want and put it in a third-party module. That doesn't mean anyone will use it. I still have to write code that handles map objects.

In retrospect I think Guido might have had the right idea of wanting to move map() and filter() into functools along with reduce(). There's surprisingly more at stake in terms of backwards compatibility and least-astonishment when it comes to built-ins. I think that's in part why the new Python 3 definitions of map() and filter() were kept so simple: although they were not backwards compatible, I do think they were well designed to minimize astonishment.
That's why I don't necessarily disagree with the choices made (but still would like to think about how we can make enhancements going forward).

> 2. If we're saying "the builtin map needs to behave like that", then
> 2a. *Why*? What is so special about this situation that the builtin
> has to be changed?

The same question could apply to the last time it was changed. I think now we're trying to find some middle ground.

> 2b. Compatibility questions need to be addressed. Is this important
> enough to code that "needs" it that such code is OK with being Python
> 3.8+ only? If not, why aren't the workarounds needed for Python 3.7
> good enough? (Long term improvement and simplification of the code
> *is* a sufficient reason here, it's just something that should be
> explicit, as it means that the benefits are long-term rather than
> immediate).

That's a good point: I think the same arguments as for enhancing range() apply here, but this is worth further consideration (though having a more concrete proposal in the first place should come first).

> 2c. Weird corner case questions, while still being rare, *do* need
> to be
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On Tue, 11 Dec 2018 at 10:38, E. Madison Bray wrote:
> I don't understand why this is confusing.
[...]
> For something like a fixed sequence a "map" could just as easily be
> defined as a pair (<function>, <sequence>) that applies <function>,
> which I'm claiming is a pure function, to every element returned by
> the <sequence>. This transformation can be applied lazily on a
> per-element basis whether I'm iterating over it, or performing random
> access (since <sequence>[N] is known for all N).

What's confusing to *me*, at least, is what's actually being suggested here. There's a lot of theoretical discussion, but I've lost track of how it's grounded in reality:

1. If we're saying that "it would be nice if there were a function that acted like map but kept references to its arguments", that's easy to do as a module on PyPI. Go for it - no-one will have any problem with that.
2. If we're saying "the builtin map needs to behave like that", then
2a. *Why*? What is so special about this situation that the builtin has to be changed?
2b. Compatibility questions need to be addressed. Is this important enough to code that "needs" it that such code is OK with being Python 3.8+ only? If not, why aren't the workarounds needed for Python 3.7 good enough? (Long term improvement and simplification of the code *is* a sufficient reason here, it's just something that should be explicit, as it means that the benefits are long-term rather than immediate).
2c. Weird corner case questions, while still being rare, *do* need to be addressed - once a certain behaviour is in the stdlib, changing it is a major pain, so we have a responsibility to get even the corner cases right.
2d. It's not actually clear to me how critical that need actually is.
Nice to have, sure (you only need a couple of people who would use a feature for it to be "nice to have") but beyond that I haven't seen a huge number of people offering examples of code that would benefit (you mentioned Sage, but that example rapidly degenerated into debates about Sage's design, and while that's a very good reason for not wanting to continue using that as a use case, it does leave us with few actual use cases, and none that I'm aware of that are in production code...)
3. If we're saying something else (your comment "map could just as easily be defined as..." suggests that you might be) then I'm not clear what it is. Can you describe your proposal as pseudo-code, or a Python implementation of the "map" replacement you're proposing?

Paul
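To make the compatibility question in 2b concrete, the following is a rough sketch of what the Python 3 map object does not support and of the two workarounds that come up in this thread: keeping a reference to the underlying list, or materialising the result with list(). The helper names (double, supports) are hypothetical and exist only for this illustration.

```python
def double(x):
    return 2 * x

def supports(operation, obj):
    """Return True if operation(obj) does not raise TypeError."""
    try:
        operation(obj)
    except TypeError:
        return False
    return True

lst = [1, 2, 3]
mapped = map(double, lst)

# The Python 3 map object is an iterator: no len() and no indexing.
has_len = supports(len, mapped)
has_index = supports(lambda m: m[0], mapped)

# Workaround 1: keep a reference to the list and apply the function on demand.
length = len(lst)
second = double(lst[1])

# Workaround 2: give up laziness and build a concrete list.
as_list = list(map(double, lst))
```

Both workarounds are available in every Python 3 release, which is the crux of 2b: any change to the builtin has to justify itself against code that already does this.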
Re: [Python-ideas] Suggested MapView object (Re: __len__() for map())
On Tue, Dec 11, 2018 at 2:16 AM Chris Barker wrote:
> On Mon, Dec 10, 2018 at 5:23 AM E. Madison Bray wrote:
>>
>> Indeed; I believe it is very useful to have a map-like object that is
>> effectively an augmented list/sequence.
>
> but what IS a "map-like object" -- I'm trying to imagine what that actually
> means.
>
> "map" takes a function and maps it onto an iterable, returning a new
> iterable. So a map object is an iterable -- what's under the hood being used
> to create it is (and should remain) opaque.

I don't understand why this is confusing. Greg gave an example of what this *might* mean upthread. It's not the only possible approach but it is one that makes a lot of sense to me. The way you're defining "map" is arbitrary and post-hoc. It's a definition that makes sense for a "map" that's restricted to iterating over arbitrary iterators. It's how it happens to be defined in Python 3 for various reasons that you took time to explain at great length, which I regret to inform you was time wasted explaining things I already know.

For something like a fixed sequence a "map" could just as easily be defined as a pair (<function>, <sequence>) that applies <function>, which I'm claiming is a pure function, to every element returned by the <sequence>. This transformation can be applied lazily on a per-element basis whether I'm iterating over it, or performing random access (since <sequence>[N] is known for all N). Python has no formal notion of a pure function, but I'm an adult and can accept responsibility if I try to use this "map-like" object in a way that is not logically consistent.

The stuff about Sage is beside the point. I'm not even talking about that anymore.
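The "pair of a function and a sequence" idea described above could be sketched roughly as follows. This is only one possible reading, not the implementation proposed in the thread: the class name MapView is borrowed from the subject line, the single-sequence restriction is a simplifying assumption, and the function is assumed to be pure, since it is re-applied on every access.

```python
from collections.abc import Sequence

class MapView(Sequence):
    """Hypothetical lazy view: a (function, sequence) pair."""

    def __init__(self, func, seq):
        # Only true sequences are accepted, so length and random access
        # are always available without consuming anything.
        if not isinstance(seq, Sequence):
            raise TypeError("MapView requires a Sequence")
        self.func = func
        self.seq = seq

    def __len__(self):
        return len(self.seq)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing stays lazy: wrap the sliced sequence in a new view.
            return MapView(self.func, self.seq[index])
        # The function is applied per element, at access time.
        return self.func(self.seq[index])

view = MapView(str.upper, "abcd")
```

Because the class derives from collections.abc.Sequence, iteration, containment tests and reversed() come from the mixin methods. The object is an iterable but not an iterator, so iter(view) returns a fresh iterator each time rather than the view itself.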