P.J. Eby writes:

 > This doesn't have to be in the functions; it can be in the
 > *types*. Mixed-type string operations have to do type checking and
 > upcasting already, but if the protocol were open, you could make an
 > encoded-bytes type that would handle the error checking.
Don't you realize that "encoded-bytes" is equivalent to use of a very limited profile of ISO 2022 coding extensions? Such as Emacs/MULE internal encoding or TRON code? It has been tried. It does not work. I understand how types can do such checking; my point is that the encoded-bytes type doesn't have enough information to do it in the cases where you think it is better than converting to str. There are *no useful operations* that can be done on two encoded-bytes with different encodings unless you know the ultimate target codec. The only sensible way to define the concatenation of ('ascii', 'English') with ('euc-jp','ÆüËܸì') is something like ('ascii', 'English', 'euc-jp','ÆüËܸì'), and *not* ('euc-jp','EnglishÆüËܸì'), because you don't know that the ultimate target codec is 'euc-jp'-compatible. Worse, you need to build in all the information about which codecs are mutually compatible into the encoded-bytes type. For example, if the ultimate target is known to be 'shift_jis', it's trivially compatible with 'ascii' and 'euc-jp' requires a conversion, but latin-9 you can't have. > (Btw, in some earlier emails, Stephen, you implied that this could be > fixed with codecs -- but it can't, because the problem isn't with the > bytes containing invalid Unicode, it's with the Unicode containing > invalid bytes -- i.e., characters that can't be encoded to the > ultimate codec target.) No, the problem is not with the Unicode, it is with the code that allows characters not encodable with the target codec. If you don't have a target codec, there are ascii-safe source codecs, such as 'latin-1' or 'ascii' with surrogateescape, that will work any time that bytes-oriented processing can work. _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
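
To make both points concrete, here is a minimal Python sketch. The
names EncodedBytes, from_text and encode_for are hypothetical, invented
for illustration; they are not an existing API or anyone's actual
proposal. The Japanese string is 日本語, whose EUC-JP bytes are
C6 FC CB DC B8 EC. The first part shows that concatenation can only
keep the segments side by side until a target codec is named; the
second shows the ascii-safe source codecs round-tripping unknown bytes.

```python
# Hypothetical sketch: EncodedBytes and its methods are illustrative names,
# not part of Python or of any proposal in this thread.

class EncodedBytes:
    """A run of (codec, bytes) segments, kept separate until a target is known."""

    def __init__(self, *segments):
        self.segments = tuple(segments)

    @classmethod
    def from_text(cls, codec, text):
        return cls((codec, text.encode(codec)))

    def __add__(self, other):
        if not isinstance(other, EncodedBytes):
            return NotImplemented
        # No merging: without the target codec there is no single encoding to pick.
        return EncodedBytes(*self.segments, *other.segments)

    def encode_for(self, target):
        """Transcode each segment via str; raises UnicodeEncodeError if impossible."""
        return b"".join(
            data.decode(codec).encode(target) for codec, data in self.segments
        )


# Part 1: concatenation with an unknown target just accumulates segments.
mixed = EncodedBytes.from_text("ascii", "English") + EncodedBytes.from_text(
    "euc-jp", "日本語"
)
codecs_used = [codec for codec, _ in mixed.segments]

# Once the target is known, the result is decidable.
sjis = mixed.encode_for("shift_jis")
try:
    mixed.encode_for("iso8859_15")  # latin-9 has no Japanese characters
    latin9_ok = True
except UnicodeEncodeError:
    latin9_ok = False

# Part 2: no target codec, so decode with an ascii-safe source codec instead.
raw = b"Subject: \xc6\xfc\xcb\xdc\xb8\xec\r\n"  # non-ASCII bytes, encoding unknown
text_sx = raw.decode("ascii", errors="surrogateescape")
text_l1 = raw.decode("latin-1")

# ASCII-oriented processing works on the str form.
field_name, _, field_value = text_sx.partition(": ")

# Both decodings give back the original bytes unchanged.
roundtrip_sx = text_sx.encode("ascii", errors="surrogateescape")
roundtrip_l1 = text_l1.encode("latin-1")
```

Note that encode_for has to go through str to do anything useful,
which is the argument for converting to str in the first place.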