(I'm not really following Perl 6, but Unicode is obviously something
I have a concern about. Please *do* CC me replies, just this once.)
On Sat, Aug 05, 2000 at 11:16:46AM +0000, Nick Ing-Simmons wrote:
> Agreed - but that is due to grafting it in late - and possibly
> trying to be too clever intuiting whether existing perl5-code is
> working on bytes or chars.
This is why we should:
i) Make the choice of internal encoding (UTF-8/UTF-16/UTF-32) decidable
at compile time.
ii) Deal with strings internally through pluggable support routines.
iii) Never assume bytes.
iv) Provide the user a method of converting their input and output to and
from whichever UTF encoding Perl uses internally.
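A rough sketch of how (i) to (iv) could fit together, written in Python purely for illustration. Every name here (`INTERNAL_ENCODING`, `InternalString`, `from_external`, `to_external`) is hypothetical and is not part of Perl or any proposed Perl 6 API. The module-level constant stands in for a build-time choice, and Python's codec lookup by name stands in for the pluggable support routines.

```python
# Hypothetical sketch: none of these names come from Perl itself.

# (i) Stand-in for a build-time choice: "utf-8", "utf-16-le" or "utf-32-le".
INTERNAL_ENCODING = "utf-8"


class InternalString:
    """Text held in the internal encoding; callers never touch the raw bytes."""

    def __init__(self, text: str):
        # (ii) All encoding work goes through codec routines looked up by name.
        self._buf = text.encode(INTERNAL_ENCODING)

    @classmethod
    def from_external(cls, data: bytes, encoding: str) -> "InternalString":
        # (iv) Convert input from the user's encoding to the internal one.
        return cls(data.decode(encoding))

    def to_external(self, encoding: str) -> bytes:
        # (iv) Convert from the internal encoding to whatever the user asks for.
        return self._buf.decode(INTERNAL_ENCODING).encode(encoding)

    def length(self) -> int:
        # (iii) Count code points, never bytes.
        return len(self._buf.decode(INTERNAL_ENCODING))


s = InternalString.from_external("caf\xe9".encode("latin-1"), "latin-1")
print(s.length())                  # code points, independent of INTERNAL_ENCODING
print(s.to_external("utf-16-le"))  # bytes in the encoding the caller requested
```

Changing `INTERNAL_ENCODING` alters only how `_buf` is stored; the results of `length` and `to_external` stay the same, which is the point of never assuming bytes.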
> But the goal was to avoid a 100Mbyte ASCII "string" becoming a 400Mbyte
> UTF32 "string" with 300Mbytes of 0x000000.
Hey, if the user wants it, the user ought to get it.
"No UTF32 for you!" - Perl Nazi.
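For anyone who wants to check the size figures quoted above, here is a small Python sketch. Python and the function name `encoded_sizes` are only for illustration, and the sample is scaled down from 100 Mbytes to 1000 characters. The `-le` codec names are used so that no byte-order mark is added to the counts.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded size in bytes of `text` in each UTF encoding form."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


# A pure-ASCII sample, scaled down from the 100 Mbyte case.
sample = "a" * 1000
sizes = encoded_sizes(sample)

# Padding bytes: each ASCII character carries three zero bytes in UTF-32.
zero_bytes = sample.encode("utf-32-le").count(0)

print(sizes)
print(zero_bytes)
```

For pure ASCII the UTF-32 form is four times the size of the UTF-8 form, and three quarters of it is zero bytes, which matches the 100 Mbyte to 400 Mbyte figure.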
> Perhaps the regex engine should always force UTF8 form?
I think we really want to store data internally in a common, Unicode format.
Simon
--
You're not Dave. Who are you?