At 8:51 PM +0200 4/13/05, BÁRTHÁZI András wrote:
Hi,

ah! That makes perfect sense, thanks for clarifying matters! :)

Ok, then it seems we need to have a builtin, such that:
  new_builtin(0xE2) ~ new_builtin(0x82) ~ new_builtin(0xAC) eq
  "\xE2\x82\xAC"

I think - conceptually - it cannot be done, because you cannot store a byte in a character string, and ~ is for concatenating character strings, not byte strings. In fact, you can do it, because the internal string representation in Pugs (and, as far as I know, in Parrot) is UTF-8.
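
To make the byte-versus-character distinction concrete, here is a rough sketch in Python (not Perl 6, and not the hypothetical new_builtin from the snippet above). It assumes nothing about Pugs or Parrot internals; it only shows that concatenating three bytes and concatenating three characters with the same numeric values give different results.

```python
# Byte-string concatenation: the three bytes form the UTF-8 encoding of U+20AC.
euro_bytes = bytes([0xE2]) + bytes([0x82]) + bytes([0xAC])
euro_char = euro_bytes.decode("utf-8")  # a single character, the euro sign

# Character-string concatenation: chr() yields code points U+00E2, U+0082, U+00AC,
# so the result is three characters, not one.
three_chars = chr(0xE2) + chr(0x82) + chr(0xAC)

# Encoding those three characters as UTF-8 takes two bytes each.
reencoded = three_chars.encode("utf-8")

print(len(euro_char), len(three_chars), len(reencoded))
```

The two only coincide if character strings are allowed to hold raw bytes, which is the conceptual objection raised above.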

Parrot's not UTF-8 internally. It can do UTF-8 if it must, but we prefer not, since UTF-8 sucks in so very many ways.


Parrot's encoding-neutral. You can (or will be able to, once I finish some library code) mix Unicode, Latin-3, Shift-JIS, EBCDIC, and EUC-KR string data in one program if you want to. (Though I'd generally recommend against it.)
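
As an illustration only, the idea of strings that carry their own encoding can be sketched in Python. TaggedString and its methods are hypothetical names made up for this example, and the choice to fall back to UTF-8 when two encodings differ is an assumption of the sketch; none of this describes Parrot's actual string implementation or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedString:
    """Hypothetical string type: raw bytes plus the name of their encoding."""
    data: bytes
    encoding: str

    @classmethod
    def from_text(cls, text, encoding):
        # Store the text in the requested encoding rather than a fixed one.
        return cls(text.encode(encoding), encoding)

    def text(self):
        # Decode using the encoding this particular string carries.
        return self.data.decode(self.encoding)

    def concat(self, other):
        # Same encoding: plain byte concatenation, no conversion.
        if self.encoding == other.encoding:
            return TaggedString(self.data + other.data, self.encoding)
        # Different encodings: go through Unicode text and pick UTF-8
        # (an arbitrary choice for this sketch).
        return TaggedString.from_text(self.text() + other.text(), "utf-8")


latin3 = TaggedString.from_text("ĉu", "iso8859_3")    # Latin-3
sjis = TaggedString.from_text("日本", "shift_jis")     # Shift-JIS
ebcdic = TaggedString.from_text("OK", "cp037")         # one EBCDIC code page
korean = TaggedString.from_text("한국", "euc_kr")      # EUC-KR

mixed = latin3.concat(sjis).concat(ebcdic).concat(korean)
print(mixed.encoding, mixed.text())
```

Every mixed-encoding concatenation here pays for a decode and a re-encode, which is one practical reason to avoid mixing encodings even when the runtime allows it.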
--
Dan


--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
