> I agree, but using string(8bit) to mean "binary data" is something
> that's 100% backward compatible.
It would not be backwards compatible, since that is not what
string(8bit) means today.

> Unicode text would always be referred to as string(21bit), even if it
> happens to contain nothing but Latin-1 characters.

That doesn't really make sense. So you're saying that
"R\xe4ksm\xf6rg\xe5s" would have type string(21bit)? What type would
"\U12345678" have? What type would "Foo" have? How would you specify a
UTF-8 encoded literal?
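
For reference, here's a minimal sketch of what those literals look like
in Pike today (assuming current behaviour, where String.width() reports
the narrowest storage width of a string; storage width is of course not
the same thing as a declared type, which is the point under debate):

    int main() {
        // Storage width (8, 16 or 32) of each literal in question:
        write("%d\n", String.width("Foo"));                  // 8 - plain ASCII
        write("%d\n", String.width("R\xe4ksm\xf6rg\xe5s"));  // 8 - Latin-1 fits in 8 bits
        write("%d\n", String.width("\U12345678"));           // 32 - value above 21 bits
        // There is no UTF-8 literal syntax; you encode explicitly:
        write("%O\n", string_to_utf8("R\xe4ksm\xf6rg\xe5s"));
        return 0;
    }

Note that "\U12345678" already stores fine in a width-32 string even
though the value doesn't fit in 21 bits, which is why the question of
its type matters.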