Re: [svn:perl6-synopsis] r14489 - doc/trunk/design/syn

Darren Duncan Thu, 10 Jan 2008 23:34:01 -0800

At 11:09 PM -0800 1/10/08, Larry Wall wrote:

It's really already very much like you want it to be.  Most Str objects
do not in fact have any byte semantics.  If you say "foo".bytes, that
is shorthand for "foo".bytes(:nf<c>, :enc<UTF-8>).  In other words,
you have to tell it what units you want the bytes to be measured in.
It just assumes utf-8 as a convenient default.  Likewise a Str does
not have any codepoint semantics unless you tell it the normalization
to assume.


Oh, that's good then.

Until now my interpretation of the Perl 6 situation is that while Str objects were conceptually grapheme strings, which .graphs refers to, you could access the currently in-use implementation details of that object using .codes and .bytes et al. Timtoady (user choice of abstraction level) and all that.

As such, in my own Muldis D language design, which is heavily influenced by Perl 6, and has its character strings as highest-possible-abstraction unicode (generally graphemes), I made a point that all character string operations were more implementation agnostic, hence rather than 'graphs' or 'codes' there are 'nfc_graphs' or 'nfd_codes' etc.

I'm glad to see, from your latest post, that this is how Perl 6 actually works as well. That is, .codes specifically works in terms of a particular normal form (either a specified one or a default one) rather than the current implementation, which makes this aspect of Perl 6 a lot more deterministic and portable.
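
To make the point concrete, here is a rough sketch in Python rather than Perl 6, since the Perl 6 method signatures were still in flux. The helper names codes and nbytes are hypothetical stand-ins for .codes and .bytes, and the defaults (NFC, UTF-8) are assumed from the quoted text above. It only illustrates that the counts depend on the normal form and encoding you ask for, not on how the string happens to be stored.

```python
import unicodedata

# Hypothetical helpers, loosely analogous to .codes and .bytes.
# Defaults (NFC, UTF-8) are an assumption taken from the quoted post.

def codes(s, nf="NFC"):
    """Count code points after normalizing s to the requested form."""
    return len(unicodedata.normalize(nf, s))

def nbytes(s, nf="NFC", enc="utf-8"):
    """Count bytes after normalizing s and encoding it as requested."""
    return len(unicodedata.normalize(nf, s).encode(enc))

# One grapheme, stored here as 'e' followed by a combining acute accent.
s = "e\u0301"

print(codes(s, "NFC"))                # 1: composed into U+00E9
print(codes(s, "NFD"))                # 2: 'e' plus U+0301
print(nbytes(s, "NFC", "utf-8"))      # 2
print(nbytes(s, "NFD", "utf-8"))      # 3
print(nbytes(s, "NFC", "utf-16-le"))  # 2
```

The same string gives different answers for each (normal form, encoding) pair, and the same answer on any implementation, whatever its internal storage.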


-- Darren Duncan
