Probably the most prevalent reason it made things *worse* is that many
functions that can take collections as arguments--in fact probably
most--were never written to accept arbitrary iterables in the first place.
Perhaps they should have been, but the majority of that code predates my
involvement, so I and others who worked on the Python 3 port were stuck
with it.

Sure, the fix is simple enough: check whether the object is iterable
(itself not always as simple as one might assume) and then call list() on
it. But we're talking about thousands upon thousands of functions that need
to be updated, in places where examples involving map previously would have
just worked.
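To make that concrete, here is a minimal sketch of the kind of defensive
conversion I mean (the helper name and exact checks are mine for
illustration, not anything actually in SageMath):

from collections.abc import Iterable

def ensure_list(obj):
    # Accept any iterable (list, tuple, map, generator, ...) and
    # materialize it as a list.
    if isinstance(obj, list):
        return obj
    if isinstance(obj, Iterable):
        # Note: this consumes one-shot iterators and never returns for
        # an infinite one, which is exactly the risk discussed below.
        return list(obj)
    raise TypeError("expected an iterable, got %r" % type(obj).__name__)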

But beyond the obvious workarounds, I would now like to be able to protect
users, where possible, from passing arbitrarily sized data to relatively
flimsy C libraries, and, as I mentioned in my last message, to make new
optimizations that weren't possible before.

Of course this isn't always possible when dealing with an arbitrary opaque
iterator, or in some pathological cases. But I'm more concerned with doing
the best we can for the typical inputs (lists, tuples, vectors, etc.),
which are *vastly* more common.
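To sketch what I mean by handling the common case well (the names and the
size limit here are hypothetical, just to illustrate the idea):

from collections.abc import Sized

MAX_SAFE_ITEMS = 10**6  # arbitrary illustrative limit

def materialize_safely(obj, limit=MAX_SAFE_ITEMS):
    if isinstance(obj, Sized):
        # Common case: lists, tuples, vectors, etc. report their length
        # up front, so oversized input can be refused before touching it.
        if len(obj) > limit:
            raise ValueError("input too large to pass to the C backend")
        return list(obj)
    # Opaque iterator: no length available, so consume incrementally
    # and bail out if it grows past the limit.
    result = []
    for i, item in enumerate(obj):
        if i >= limit:
            raise ValueError("input exceeded the safe size limit")
        result.append(item)
    return result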

I use SageMath as an example, but I'm sure others could come up with their
own clever use cases. There are other situations where I've wanted to at
least try to get the len of a map when it was unambiguous (for example,
when building a progress meter or something similar).
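For the unambiguous case the rule is simple: map() stops at its shortest
input, so if every input iterable reports a len(), the length of the map is
just the minimum of those lengths. A tiny sketch (map_len is a hypothetical
helper of mine, not an existing API):

def map_len(*iterables):
    try:
        return min(len(it) for it in iterables)
    except TypeError:
        return None  # at least one input has no definite length

data = [1, 2, 3, 4, 5]
n = map_len(data)        # 5, so a progress meter knows how far to go
results = map(str, data)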

On Wed, Nov 28, 2018, 16:33 Steven D'Aprano <st...@pearwood.info> wrote:

> On Wed, Nov 28, 2018 at 04:14:24PM +0100, E. Madison Bray wrote:
>
> > For example, some function that used to expect some finite-sized
> > sequence such as a list or tuple is now passed a "map", possibly
> > wrapping one or more iterable of arbitrary, possibly non-finite size.
> > For the purposes of some algorithm I have this is not useful and I
> > need to convert it to a sequence anyways but don't want to do that
> > without some guarantee that I won't blow up the user's memory usage.
> > So I might want to check:
> >
> > finite_definite = True
> > for it in my_map.iters:
> >     try:
> >         len(it)
> >     except TypeError:
> >         finite_definite = False
> >
> > if finite_definite:
> >     my_seq = list(my_map)
> > else:
> >     # some other algorithm
>
> But surely you didn't need to do this just because of *map*. Users could
> have passed an infinite, unsized iterable going back to Python 1 days
> with the old sequence protocol. They certainly could pass a generator or
> other opaque iterator apart from map. So I'm having trouble seeing why
> the Python 2/3 change to map made things worse for SageMath.
>
> But in any case, this example comes back to the question of len again,
> and we've already covered why this is problematic. In case you missed
> it, let's take a toy example which demonstrates the problem:
>
>
> def mean(it):
>     if isinstance(it, map):
>         # Hypothetical attribute access to the underlying iterable.
>         n = len(it.iterable)
>         return sum(it)/n
>
>
> Now let's pass a map object to it:
>
> data = [1, 2, 3, 4, 5]
> it = map(lambda x: x, data)
> assert len(it.iterable) == 5
> next(it); next(it); next(it)
>
> assert mean(it) == 4.5
> # fails, as it actually returns 9/5 instead of 9/2
>
>
> --
> Steve