Steven D'Aprano wrote:
> On Sun, Feb 23, 2020 at 08:51:55PM -0000, Steve Jorgensen wrote:
> Python has not just the "iterator protocol" using __iter__ and 
> __next__, but also has an older sequence protocol used in Python 1.x 
> which still exists to this day. This sequence protocol falls back on 
> repeated indexing.
> Conceptually, we should be able to reason that every object that 
> supports indexing should be iterable, without adding a special case 
> exception "...except for str".

Strings already have an exception in this area. Usually `x in y` means `any(x 
== elem for elem in y)`. It makes the two meanings of `in` match, and to me (I 
don't know if this is true) it's the reason that iterating over dictionaries 
yields the keys, although personally I'd find it more convenient if it yielded 
the items. But `in` means something else for strings. It's not as strong a rule 
as the link between iteration and indexing, but it is a break in tradition.

Another somewhat related example: we usually accept that basically every object 
can be treated as a boolean, even more so of it has a `__len__`. But numpy and 
pandas break this 'rule' by raising an exception if you try to treat an array 
as a boolean, e.g:

ValueError: The truth value of an array with more than one element is 
ambiguous. 
Use a.any() or a.all()

In a sense these libraries decided that while unambiguous behaviour could be 
defined, the intention of the user would always be ambiguous. The same could be 
said for strings. Needing to iterate over a string is simply not common unless 
you're writing something like a parser. So even though the behaviour is well 
defined and documented, when someone tries to iterate over a string, 
statistically we can say there's a good chance that's not what they actually 
want. And in the face of ambiguity, refuse the temptation to guess.

I do think it would be a pity if strings broke the tradition of indexable 
implies iterable, but "A Foolish Consistency is the Hobgoblin of Little Minds". 
The benefits in helping users when debugging would outweigh the inconsistency 
and the minor inconvenience of adding a few characters. Users who are expecting 
iteration to work because indexing works will quickly get a helpful error 
message and fix their problem. At the risk of overusing classic Python sayings, 
Explicit is better than implicit.

However, we could get the benefit of making debugging easier without having to 
actually *break* any existing code if we just raised a warning whenever someone 
iterates over a string. It doesn't have to be a deprecation warning and we 
don't need to ever actually make strings non-iterable.

I'm out of time, so I'll just quickly say that I prefer `.chars` as a property 
without the `()`. And jdveiga you asked what would be the advantage of all this 
after I made my previous post about it biting beginners, I'm not sure if you 
missed that or you were just typing yours when I made mine.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KSFGFI4JELDAN3NWJI374GICNWNUBDHX/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to