[Python-ideas] Re: Enhancing iterator objects with map, filter, reduce methods

Raimi bin Karim Sun, 28 Nov 2021 09:12:51 -0800

Hi Stephen,

Stephen J. Turnbull wrote:
> 1.  Is dataflow/fluent programming distinguishable from whatever it
>     was that Guido didn't like about method chaining idioms?  If so,
>     how?
Are you referring to this 
https://mail.python.org/pipermail/python-dev/2003-October/038855.html?
He mentioned (if I may summarize) (1) familiarity with the API and 
(2) making mistakes. In fluent programming (at least in the implementation
that I suggested for iterators), it is easy to make mistakes. E.g.


    pipeline([1,2,3]).reduce(lambda a,b: a+b).map(lambda x: x+1)

because some methods, such as reduce, return a non-sequence object
instead of self, so nothing further can be chained onto the result.

But with all due respect, you _do_ need to be familiar with the API to use
it, so I don't see why (1) could be an issue. And with familiarity, you would
make fewer mistakes.
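
To make the mistake above concrete, here is a minimal sketch. The Pipeline
class is hypothetical (a stand-in for the pipeline(...) wrapper in the example,
not an existing stdlib API), and only map and reduce are implemented:

```python
from functools import reduce


class Pipeline:
    """Hypothetical wrapper around an iterator; not a stdlib class."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def map(self, func):
        # Chainable: returns a new Pipeline wrapping a lazy map object.
        return Pipeline(map(func, self._it))

    def reduce(self, func):
        # Terminal: consumes the iterator and returns a plain value.
        return reduce(func, self._it)


# Mapping first and reducing last works as intended.
total = Pipeline([1, 2, 3]).map(lambda x: x + 1).reduce(lambda a, b: a + b)

# Reducing first returns an int, so the following .map() fails.
try:
    Pipeline([1, 2, 3]).reduce(lambda a, b: a + b).map(lambda x: x + 1)
except AttributeError as exc:
    error = str(exc)
```

The failure is an ordinary AttributeError raised on the reduced value, which
is the kind of mistake that familiarity with the API helps avoid.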

> 2.  Is the method chaining syntax preferable to an alternative
>     operator?
I don't have an answer to this. I personally like method chaining, partly
because the handful of languages that I'm familiar with use method
chaining.

> 3.  Is there an obvious choice for the implementation?  Specifically,
>     there are at least three possibilities:
>     a.  Restrict it to mutable sequences, and do the transformations
>         in place.
>     b.  Construct nested iterators and listify the result only if
>         desired.
>     c.  Both.
The choice for this implementation is to replace chaining function calls
from the itertools module (incl. map and filter):

    list(starmap(..., filter(..., chain(..., map(..., ...)))))

with something similar, but read from left to right instead. And because
the functions from the itertools module take in any iterable (regardless
of mutability), the implementation should do the same, which is (b)
in your list.
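
As a rough sketch of option (b), the nested call can be compared with a
left-to-right equivalent. The Pipeline class and its method names are
assumptions made for illustration only; each method simply wraps the
corresponding builtin or itertools function and stays lazy until the result
is consumed:

```python
from itertools import chain, starmap
from operator import mul


class Pipeline:
    """Hypothetical lazy wrapper; each method returns a new Pipeline."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def map(self, func):
        return Pipeline(map(func, self._it))

    def chain(self, *others):
        return Pipeline(chain(self._it, *others))

    def filter(self, pred):
        return Pipeline(filter(pred, self._it))

    def starmap(self, func):
        return Pipeline(starmap(func, self._it))


pairs = [(1, 2), (0, 5)]
numbers = [3, 4]

# Nested form: read from the innermost call outwards.
nested = list(
    starmap(mul, filter(lambda p: p[0] > 0,
                        chain(map(lambda x: (x, x), numbers), pairs)))
)

# Chained form: the same steps, read from left to right.
chained = list(
    Pipeline(numbers)
    .map(lambda x: (x, x))
    .chain(pairs)
    .filter(lambda p: p[0] > 0)
    .starmap(mul)
)
```

Both forms build nested iterators and do no work until list() consumes them;
the input can be any iterable, mutable or not.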

> 4.  Is this really so tricky that the obvious implementation of the
>     iterator approach (Chris's) needs to be in the stdlib with tons of
>     methods on it, or does it make more sense have applications write
>     one with the specific methods needed for the application?
>       Or perhaps instead of creating a generic class prepopulated with
>     methods, maybe this should be a factory function which takes a
>     collection of mapping functions, and adds them to the dataflow
>     object on the fly?

I think this boils down to the itertools module (was thinking about 
it over the weekend).

I find that the itertools module and some builtins like map and filter 
don't do themselves justice when chained together. It's okay for 
one or two function calls. But the design makes it seem like they were 
never meant to be chained together (or were they?). Attempts to do 
so lead to code that must be read from right to left, making it an 
awkward API to use for transforming collections (on which most of us 
might agree). 

If it was indeed built for one or two function calls, then I would 
argue that it's not really a useable or practical module, because 
a lot of times we perform not just one or two but multiple 
transformations on collections.

So, to answer this question, I don't think the issue is whether the 
implementation is tricky such that the stdlib should do it. Rather, 
*our* itertools module itself is tricky to use, because fundamentally 
its design is not user-friendly, or rather limiting to the users. And this 
is a problem. Head over to StackOverflow and most people wouldn't 
recommend using it. It's not well-liked (except maybe by Lisp-ers). 
It's most probably because of what I mentioned in the previous 
paragraph.

What does this mean for us? I think it's a good opportunity for us 
to rethink the design to make it more usable. Hence, I'm putting 
the onus on us (stdlib), instead of relying on 3rd party libraries to 
improve on it. 

As a proposal to improve the design, I suggested above a higher-
level API for the itertools module that says "oh you want to use the 
itertools module? yeah it's a low-level module that is not meant to 
be used directly so here's a higher level API you can use instead."
The implementation doesn't have to be method chaining because 
I'm generally proposing a higher-level API.

Now, I've stated pretty much matter-of-factly that the usability of the 
itertools module is a problem, and I've put it on us to rework it. But 
what does everyone else think about this? Do you share the same 
concerns too?

> 5.  map() and zip() take multiple iterables.  Should this feature
>     handle those cases?  Note that the factory function approach
>     allows the client to make this decision for themselves.
I would say nope for map and yes for zip, viewing it from the perspective
of the underlying iterator. The .map() instance method only refers to the 
underlying iterator so it should only take a function that will transform every
element in the underlying iterator. For zip, we can take multiple iterables 
because we are zipping them with the underlying iterator. But this is my 
opinion and is a detail that we can come to a consensus on later.
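
A small sketch of that distinction, again using a hypothetical Pipeline class
whose names and signatures are assumptions rather than a settled design:
.map() accepts only a function applied to each element of the underlying
iterator, while .zip() accepts any number of extra iterables:

```python
class Pipeline:
    """Hypothetical wrapper used only to illustrate map vs. zip."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def map(self, func):
        # Takes a single function; the only input is the underlying iterator.
        return Pipeline(map(func, self._it))

    def zip(self, *others):
        # Zips the underlying iterator with any number of other iterables.
        return Pipeline(zip(self._it, *others))


result = list(
    Pipeline([1, 2, 3])
    .map(lambda x: x * 10)
    .zip("abc", [True, False, True])
)
```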

> 6.  What are the names that you propose for the APIs?  They need to
>     indicate the implementations since there are various ways to
>     implement.
I propose the names be similar to those in builtins + itertools:

    map (map_every to indicate a different implementation? though not 
    conventional), filter, reduce, starmap, starfilter, zip, enumerate

some from the Itertools Recipes section that might be more common:

    flatten, nth, take

some 'reductional' ones:

    reduce, sum, all, any, min, max, join (for string iterators)

some hybrid ones:

    flat_map, filter_map

some which return nothing:

    for_each (returning None, though this is a for-loop).
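
For the hybrid names, one possible reading is sketched below as plain
functions. The semantics are assumptions borrowed from other languages, not
something settled in this thread: flat_map maps each element to an iterable
and flattens one level, and filter_map maps each element and drops results
that are None. The parse_int helper is hypothetical:

```python
from itertools import chain


def flat_map(func, iterable):
    # Assumed semantics: map each element to an iterable, flatten one level.
    return chain.from_iterable(map(func, iterable))


def filter_map(func, iterable):
    # Assumed semantics: map each element, skip results that are None.
    return (y for y in map(func, iterable) if y is not None)


def parse_int(text):
    # Hypothetical helper: returns None when the text is not an integer.
    try:
        return int(text)
    except ValueError:
        return None


words = list(flat_map(str.split, ["a b", "c"]))
numbers = list(filter_map(parse_int, ["1", "x", "3"]))
```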
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BKX4ZYPRJQU7Y6WB43ZR3NDVBZNYRQ4I/
Code of Conduct: http://python.org/psf/codeofconduct/
