Chris Angelico <ros...@gmail.com>:

> On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa <ma...@pacujo.net> wrote:
>> When people use Unicode, they are expecting to be able to deal in real
>> characters. I would expect:
>>
>>    len(text)               to give me the length in characters
>>    text[-1]                to evaluate to the last character
>>    re.match("a.c", text)   to match a character between a and c
>>
>> So the question is, should we have a third type for text. Or should the
>> semantics of strings be changed to be based on characters?
>
> What is the length of a string? How often do you actually care about
> the number of grapheme clusters - and not, for example, about the
> pixel width?

A good question. I have in the past argued that the string should be a
special data type for specialist text-processing needs.

However, I happen to have fooled around with a character-graphics based
game in recent days, and even professionally I use character-based
alignment quite often. Consider, for example, a Python source code
editor, where you typically want to limit the line length by the number
of characters rather than by the number of pixels.

Furthermore, you dismissed only my question about

   len(text)

What about

   text[-1]
   re.match("a.c", text)


Marko
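
PS For concreteness, here is a minimal sketch of how the three expressions behave on Python 3's str, which is indexed by code point. The sample strings are made up for illustration and are not from the thread; the only assumption is that the text contains an "é" written as a base letter plus a combining accent.

```python
import re
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: one visible character,
# two code points.
decomposed = "e\u0301"
# NFC normalization folds the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# len() counts code points, not visible characters.
len_decomposed = len(decomposed)   # 2
len_composed = len(composed)       # 1

# text[-1] yields the last code point, here the bare combining accent.
word = "caf" + decomposed
last = word[-1]                    # "\u0301"

# "." matches exactly one code point, so the decomposed form does not fit
# between "a" and "c", while the composed form does.
match_decomposed = re.match("a.c", "a" + decomposed + "c")   # None
match_composed = re.match("a.c", "a" + composed + "c")       # a match object

print(len_decomposed, len_composed, repr(last))
print(match_decomposed, match_composed)
```

Normalizing to NFC happens to rescue this particular example, but only because a precomposed "é" exists; clusters without a precomposed form still span several code points after normalization.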