Re: Char encoding

Simon Cozens Mon, 14 Aug 2000 19:21:04 -0700
(I'm not really following Perl 6, but Unicode is obviously something
I have a concern about. Please *do* CC me replies, just this once.)

On Sat, Aug 05, 2000 at 11:16:46AM +0000, Nick Ing-Simmons wrote:
> Agreed - but that is due to grafting it in late - and possibly 
> trying to be too clever intuiting whether existing perl5-code is 
> working on bytes or chars.

This is why we should:
    i)   Make the choice of internal encoding (UTF-8/UTF-16/UTF-32)
         decidable at compile time.
    ii)  Deal with strings internally through pluggable support routines.
    iii) Never assume bytes.
    iv)  Provide the user with a method of converting their input and
         output to and from the UTF that Perl uses.
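
As a rough illustration of the four points above, here is a minimal sketch
in Python rather than Perl. Every name in it (INTERNAL_ENCODING,
to_internal, from_internal, char_length) is hypothetical and made up for
this example. It is not a proposed API, and the constant merely stands in
for a choice that would really be made at compile time.

```python
# Hypothetical sketch: one internal encoding, fixed up front, with all
# string handling going through a small set of replaceable routines.

# Stand-in for a compile-time choice: "utf-8", "utf-16-le" or "utf-32-le".
INTERNAL_ENCODING = "utf-8"


def to_internal(raw: bytes, source_encoding: str) -> bytes:
    """Convert external input into the internal encoding."""
    return raw.decode(source_encoding).encode(INTERNAL_ENCODING)


def from_internal(stored: bytes, target_encoding: str) -> bytes:
    """Convert internally stored data into the encoding the user wants."""
    return stored.decode(INTERNAL_ENCODING).encode(target_encoding)


def char_length(stored: bytes) -> int:
    """Length in characters (code points), never in bytes."""
    return len(stored.decode(INTERNAL_ENCODING))


latin1_input = b"caf\xe9"                      # "cafe" with e-acute, Latin-1
stored = to_internal(latin1_input, "latin-1")  # now in the internal form
print(len(stored), char_length(stored))        # byte count vs. char count
print(from_internal(stored, "latin-1") == latin1_input)
```

The byte count and the character count differ for the sample input, which
is the reason for point iii: code that assumes one byte per character
breaks as soon as the data leaves ASCII.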

> But the goal was to avoid a 100Mbyte ASCII "string" becoming a 400Mbyte
> UTF32 "string" with 300Mbytes of 0x000000.
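
The arithmetic behind the figures quoted above can be checked on a small
scale. This is only a sketch in Python, with a 1000-character ASCII string
standing in for the 100Mbyte one; the "utf-32-be" codec is used so that no
byte order mark is added to the count.

```python
# Compare the storage needed for pure-ASCII text in UTF-8 and UTF-32.
text = "a" * 1000                     # scaled-down stand-in for 100 Mbytes

utf8_bytes = text.encode("utf-8")
utf32_bytes = text.encode("utf-32-be")  # big-endian, no BOM prepended

# Every ASCII character takes 1 byte in UTF-8 and 4 bytes in UTF-32,
# and 3 of those 4 bytes are zero.
zero_bytes = utf32_bytes.count(b"\x00")

print(len(utf8_bytes), len(utf32_bytes), zero_bytes)
```

The same 1:4 ratio, with three quarters of the UTF-32 bytes being zero,
gives the 100Mbyte, 400Mbyte and 300Mbyte figures in the quoted text.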

Hey, if the user wants it, the user ought to get it.
"No UTF32 for you!" - Perl Nazi.

> Perhaps the regex engine should always force UTF8 form ?

I think we really want to store data internally in a common, Unicode format.

Simon

-- 
You're not Dave.  Who are you?