Hi,
ah! That makes perfect sense, thanks for clarifying matters! :)
Ok, then it seems we need to have a builtin, such that: new_builtin(0xE2) ~ new_builtin(0x82) ~ new_builtin(0xAC) eq "\xE2\x82\xAC"
I think - conceptually - it cannot be done, because you cannot store a byte in a character string, and ~ is for concatenating character strings, not byte strings. In practice, you can do it, because Pugs' (and, as far as I know, Parrot's) internal string representation is UTF-8.
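To make the byte-versus-character distinction concrete, here is a minimal sketch in Python rather than Perl 6 (an assumption made purely for illustration; it says nothing about how Pugs or Parrot behave). The variable names are made up for the example.

```python
# The three bytes from the example above, each as a one-byte byte string.
euro_bytes = bytes([0xE2]) + bytes([0x82]) + bytes([0xAC])

# Concatenating byte strings yields the raw three-byte sequence.
assert euro_bytes == b"\xE2\x82\xAC"

# Interpreted as UTF-8, those three bytes are one character, U+20AC.
euro_char = euro_bytes.decode("utf-8")
assert euro_char == "\u20ac"
assert len(euro_char) == 1

# Treating each value as a character instead gives a different,
# three-character string (U+00E2, U+0082, U+00AC).
as_chars = chr(0xE2) + chr(0x82) + chr(0xAC)
assert len(as_chars) == 3
assert as_chars != euro_char
```

The comparison only holds at the byte level, or at the character level if the implementation happens to store characters as UTF-8 and lets you reach the bytes.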
Parrot's not UTF-8 internally. It can do UTF-8 if it must, but we prefer not, since UTF-8 sucks in so very many ways.
Parrot's encoding-neutral. You can (or will be able to, when I finish some library code) mix Unicode, Latin-3, Shift-JIS, EBCDIC, and EUC-KR string data in a program if you want. (Though I'd generally recommend against it.)
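As a rough illustration of what "encoding-neutral" could mean, here is a small Python sketch. It is not Parrot's actual design or API: the TaggedString class, its methods, and the choice to fall back to UTF-8 when encodings differ are all assumptions made for this example. Only the codec names (shift_jis, cp037 for one EBCDIC variant) are real Python codecs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedString:
    """Hypothetical string: raw bytes plus the name of their encoding."""
    data: bytes
    encoding: str

    def text(self) -> str:
        # Decode only when the characters are actually needed.
        return self.data.decode(self.encoding)

    def concat(self, other: "TaggedString") -> "TaggedString":
        # Same encoding: join the bytes, no transcoding at all.
        if self.encoding == other.encoding:
            return TaggedString(self.data + other.data, self.encoding)
        # Assumption for this sketch: mixed encodings fall back to UTF-8.
        combined = self.text() + other.text()
        return TaggedString(combined.encode("utf-8"), "utf-8")


sjis = TaggedString("日本".encode("shift_jis"), "shift_jis")
ebcdic = TaggedString("abc".encode("cp037"), "cp037")

same = sjis.concat(sjis)      # stays Shift-JIS
mixed = sjis.concat(ebcdic)   # transcoded on demand
```

In this sketch each string keeps its native encoding until an operation forces a common one, which is one way to avoid committing to a single internal representation.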
--
Dan
--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk