On Sun, Aug 21, 2016 at 12:52 PM, Steven D'Aprano <st...@pearwood.info> wrote:
>> > The fixes overall will be a lot easier and obvious than introduction of
>> > unicode as default string type in Python 3.0.
>>
>> That's a bold claim. Have you considered what's at stake if that's not true?
>
> Saying that these so-called "fixes" (we haven't established yet that
> Python's string behaviour is a bug that need fixing) will be easier and
> more obvious than the change to Unicode is not that bold a claim. Pretty
> much everything is easier and more obvious than changing to Unicode. :-)
> (Possibly not bringing peace to the Middle East.)

And yet it's so simple. We can teach novice programmers about two's
complement [1] representations of integers, and they have no trouble
comprehending that the abstract concept of "integer" is different from
the concrete representation in memory. We can teach intermediate
programmers how hash tables work, and how to improve their performance
on CPUs with 64-byte cache lines - again, there's no comprehension
barrier between "mapping from key to value" and "puddle of bytes in
memory that represent that mapping". But so many programmers are
entrenched in the thinking that a byte IS a character.

> I think that while the suggestion does bring some benefit, the benefit
> isn't enough to make up for the code churn and disruption it would
> cause. But I encourage the OP to go through the standard library, pick a
> couple of modules, and re-write them to see how they would look using
> this proposal.

Python still has a rule that you can iterate over anything that has
__getitem__, and it'll be called with 0, 1, 2, 3... until it raises
IndexError. So you have two options: Remove that rule, and require
that all iterable objects actually define __iter__; or make strings
non-subscriptable, which means you need to do something like
"asdf".char_at(0) instead of "asdf"[0]. IMO the second option is a
total non-flyer - good luck convincing anyone that THAT is an
improvement. The first one is possible, but dramatically broadens the
backward-compatibility issue. You'd have to search for any class that
defines __getitem__ and not __iter__.

If that *does* get considered, it wouldn't be too hard to have a
compatibility function, maybe in itertools.

def subscript(self):
    i = 0
    try:
        while "moar indexing":
            yield self[i]
            i += 1
    except IndexError:
        pass

class Demo:
    def __getitem__(self, item):
        ...
    __iter__ = itertools.subscript

But there'd have to be the full search of "what will this break", even
before getting as far as making strings non-iterable.
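
For anyone who wants to poke at this, here's a rough, self-contained sketch of both halves: the implicit __getitem__ fallback as it works today, and an explicit opt-in of the kind described above. Note that itertools.subscript does not exist; subscript here is just a module-level function, and Legacy and Explicit are made-up names for illustration.

```python
def subscript(self):
    """Hypothetical helper: iterate by indexing from 0 until IndexError."""
    i = 0
    try:
        while True:
            yield self[i]
            i += 1
    except IndexError:
        pass


class Legacy:
    """Defines only __getitem__; iterable today via the implicit fallback."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]


class Explicit(Legacy):
    """Same container, but opts in to iteration by defining __iter__."""

    __iter__ = subscript


# Legacy has no __iter__, so iter() falls back to calling __getitem__
# with 0, 1, 2, ... until IndexError is raised.
legacy_result = list(Legacy("asdf"))

# Explicit iterates through the helper instead of the implicit fallback.
explicit_result = list(Explicit("asdf"))

print(legacy_result, explicit_result)
```

If the fallback rule were removed, classes like Legacy are the ones that would stop being iterable, and a one-line change like the one in Explicit is roughly what each would need.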

ChrisA

[1] Not "two's compliment", although I'm told that Two can say some
very nice things.