Steve D'Aprano <steve+pyt...@pearwood.info>:

> These are only a *few* of the *easy* questions that need to be
> answered before we can even consider your question:
>
>> So the question is, should we have a third type for text. Or should
>> the semantics of strings be changed to be based on characters?

Sure, but if they can't be answered, what good is there in having
strings (as opposed to bytes)? What problem do strings solve? What
operation depends on (or is made simpler by) having strings (instead of
bytes)?

We are not even talking about some exotic languages, but the problem is
right there in the middle of Latin-1. We can't even say what

    len("è")

should return. And we may experience:

    >>> ord("è")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: ord() expected a character, but string of length 2 found

Of course, UTF-8 in a bytes object doesn't make the situation any
better, but does it make it any worse?

As it stands, we have

    è --[encode>-- Unicode --[reencode>-- UTF-8

Why is one encoding format better than the other?


Marko
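
P.S. To make the len("è") ambiguity concrete, here is a minimal sketch. It assumes the "è" in the traceback above was typed in its decomposed spelling ("e" followed by U+0300 COMBINING GRAVE ACCENT); the session doesn't show which spelling it was, so that is a guess. The names composed and decomposed are only illustrative.

```python
import unicodedata

# Two spellings of the same visible "è" (variable names are illustrative).
composed = "\u00e8"      # U+00E8 LATIN SMALL LETTER E WITH GRAVE
decomposed = "e\u0300"   # "e" followed by U+0300 COMBINING GRAVE ACCENT

# len() counts code points, so the answer depends on the spelling.
print(len(composed))     # 1
print(len(decomposed))   # 2

# Normalization converts between the two spellings.
assert unicodedata.normalize("NFC", decomposed) == composed
assert unicodedata.normalize("NFD", composed) == decomposed

# ord() accepts only a string of exactly one code point.
print(hex(ord(composed)))          # 0xe8
try:
    ord(decomposed)
    ord_failed = False
except TypeError:
    ord_failed = True
print(ord_failed)                  # True

# The UTF-8 byte sequences differ as well.
print(composed.encode("utf-8"))    # b'\xc3\xa8'
print(decomposed.encode("utf-8"))  # b'e\xcc\x80'
```

Normalizing to NFC before calling len() or ord() makes this particular case behave, but it only works where a precomposed code point exists, so it does not settle the general question.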