Re: [Python-ideas] + operator on generators

Steven D'Aprano Sun, 25 Jun 2017 20:24:27 -0700

On Sun, Jun 25, 2017 at 02:06:54PM +0200, lucas via Python-ideas wrote:

> What about providing something like the following:
> 
>     a = (n for n in range(2))
>     b = (n for n in range(2, 4))
>     tuple(a + b)  # -> 0 1 2 3


As Serhiy points out, this is going to conflict with the existing use of 
the + operator for string and sequence concatenation.

I have a counter-proposal: introduce the iterator chaining operator "&":

    iterable & iterable --> itertools.chain(iterable, iterable)

The reason I choose & rather than + is that & is less likely to conflict 
with any existing string/sequence types. None of the built-in or std lib 
sequences that I can think of support the & operator.
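
For reference, a minimal sketch of what is possible today without any new 
operator: the generators from the quoted example chained with 
itertools.chain, followed by a check that list, tuple and str currently 
reject &. The variable names are illustrative only.

```python
import itertools

# The two generators from the quoted example.
a = (n for n in range(2))
b = (n for n in range(2, 4))

# What the proposed `a & b` would be equivalent to.
chained = tuple(itertools.chain(a, b))
print(chained)  # (0, 1, 2, 3)

# Built-in sequences do not define __and__, so & raises TypeError today.
unsupported = []
for left, right in (([1], [2]), ((1,), (2,)), ("a", "b")):
    try:
        left & right
    except TypeError:
        unsupported.append(type(left).__name__)

print(unsupported)  # ['list', 'tuple', 'str']
```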

Also, & is used for (string?) concatenation in some languages, such as 
VB.Net, some BASIC dialects, Hypertalk, AppleScript, and Ada. Iterator 
chaining is more like concatenation than (numeric) addition.

However, the & operator is already used for bitwise-AND. Under my 
proposal that behaviour will continue, and will take priority over 
chaining. Currently, the & operator does something similar to (but 
significantly more complex than) this:


# simplified pseudo-code of existing behaviour
if hasattr(x, '__and__'):
    return x.__and__(y)
elif hasattr(y, '__rand__'):
    return y.__rand__(x)
else:
    raise TypeError
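
As a rough illustration of the "significantly more complex" part, here is 
a somewhat closer model of the existing dispatch. It is still a sketch: 
the function name binary_and is hypothetical, and the rule that gives a 
subclass's reflected method priority over the left operand is omitted.

```python
def binary_and(x, y):
    """Simplified model of how `x & y` is dispatched today."""
    # Special methods are looked up on the type, not on the instance.
    forward = getattr(type(x), "__and__", None)
    if forward is not None:
        result = forward(x, y)
        # NotImplemented means "I don't handle this operand, try the other".
        if result is not NotImplemented:
            return result
    # The reflected method is only tried when the operand types differ.
    if type(x) is not type(y):
        reflected = getattr(type(y), "__rand__", None)
        if reflected is not None:
            result = reflected(y, x)
            if result is not NotImplemented:
                return result
    raise TypeError(
        "unsupported operand type(s) for &: "
        f"{type(x).__name__!r} and {type(y).__name__!r}"
    )


print(binary_and(6, 3))                    # 2
print(binary_and({1, 2}, frozenset({2})))  # {2}
```

Calling binary_and([1], [2]) raises TypeError, matching what `[1] & [2]` 
does now.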


The key is to insert the new behaviour after the existing __(r)and__ 
code, just before TypeError is raised:


attempt existing __(r)and__ behaviour
if and only if that fails to apply:
    return itertools.chain(iter(x), iter(y))


So classes that define a __(r)and__ method will keep their existing 
behaviour.
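
The proposed semantics can be approximated in pure Python as below. This 
is only a sketch under stated assumptions: chain_and is a hypothetical 
stand-in for the operator, and it detects "no __(r)and__ applies" by 
catching TypeError, which would also swallow a TypeError raised inside a 
user-defined __and__. A real implementation would sit inside the 
interpreter's operator dispatch instead.

```python
import itertools
import operator


def chain_and(x, y):
    """Hypothetical stand-in for the proposed `x & y`."""
    try:
        # Existing behaviour: __and__ / __rand__ take priority.
        return operator.and_(x, y)
    except TypeError:
        # Assumption: TypeError here means neither operand supports &.
        # iter() is called on both operands before chain() is built.
        return itertools.chain(iter(x), iter(y))


print(chain_and(6, 3))                                    # 2
print(chain_and({1, 2}, {2, 3}))                          # {2}
print(list(chain_and([1, 2], (n for n in range(3, 5)))))  # [1, 2, 3, 4]
```

Integers and sets keep their bitwise-AND and intersection results, while a 
list and a generator, which define neither method, fall through to 
chaining.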

This implies that we cannot use & to chain sets and frozensets, since 
they already define __(r)and__. This has an easy work-around: just call 
iter() on the set first.
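
A short sketch of why the work-around would apply: set & set is 
intersection today, whereas the iterator returned by iter() on a set does 
not define __and__, so under the proposal it would fall through to 
chaining. The chaining itself is spelled with itertools.chain here because 
the operator does not exist; the variable names are illustrative.

```python
import itertools

evens = {0, 2, 4}
small = frozenset({0, 1, 2})

# Sets already define __and__, so & means intersection.
intersection = evens & small
print(sorted(intersection))  # [0, 2]

# Set iterators do not define __and__, so they would be chained instead.
print(hasattr(iter(evens), "__and__"))  # False

chained = list(itertools.chain(iter(evens), iter(small)))
print(sorted(chained))  # [0, 0, 1, 2, 2, 4]
```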

Applying & to objects which don't define __(r)and__ and aren't iterable 
will continue to raise TypeError, just as it does now. The only 
backwards incompatibility this proposal introduces is for any code which 
relies on `iterable & iterable` to raise TypeError. Frankly I can't 
imagine that there is any such code, outside of the Python test suite, 
but if there is, and people think it is worth it, we could make this a 
__future__ import. But I think that's overkill.

The downside to this proposal is that it adds some conceptual complexity 
to Python operators. Apart from `is` and `is not`, all Python operators 
call one or more dunder methods. This is (as far as I know) the first 
operator which has fall-back functionality if the dunder methods aren't 
defined.

Up to now, I've talked about & chaining being equivalent to the 
itertools.chain function. That glosses over one difference which 
needs to be mentioned. The chain function currently doesn't attempt 
to iterate over its arguments until needed:


py> import itertools
py> x = itertools.chain("a", 1, "c")
py> next(x)
'a'
py> next(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable


Any proposal to change this behaviour for the itertools.chain 
function should be kept separate from this one.

But for the & chaining operator, I think that behaviour must change: if 
we have an operand that is neither iterable nor defines __(r)and__, the 
& operator should fail early:

[1, 2, 3] & None

should raise TypeError immediately, unlike itertools.chain().
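
The difference can be sketched as follows. The helper eager_chain is 
hypothetical; it simply calls iter() on both operands up front, as in the 
pseudo-code earlier, so a non-iterable operand is rejected before any item 
is consumed.

```python
import itertools

# itertools.chain is lazy: the bad argument is accepted without complaint.
lazy = itertools.chain([1, 2, 3], None)
first = next(lazy)
print(first)  # 1


def eager_chain(x, y):
    """Hypothetical helper: validate both operands before chaining."""
    return itertools.chain(iter(x), iter(y))


failed_early = False
try:
    eager_chain([1, 2, 3], None)
except TypeError:
    # iter(None) raises before any chain object is returned.
    failed_early = True

print(failed_early)  # True
```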



-- 
Steve