On Wed, 01 Dec 2010 03:30:07 -0500
foobar <[email protected]> wrote:

> Steven Schveighoffer Wrote:
> [snipped]
> > > 3. You have no access to the underlying array unless you're dealing with
> > > an actual array of dchar.
> > 
> > I thought of adding some kind of access.  I wasn't sure the best way.
> > 
> > I was thinking of allowing direct access via opCast, because I think  
> > casting might be a sufficient red flag to let you know you are crossing  
> > into dangerous waters.
> > 
> > But it could just be as easy as making the array itself public.
> > 
> 
> > -Steve
> 
> A string type should always maintain the invariant that it is a valid Unicode 
> string. Therefore I don't like having an unsafe opCast or providing direct 
> access to the underlying array. I feel that there should be a read-only 
> property for that. Algorithms that manipulate char[]'s should construct a new 
> string instance, which will validate that the char[] it is built from is a 
> valid UTF string.

But then, why not systematically store a dchar[] array? Validation and decoding 
are the same job. Once the text is decoded, all methods work as expected (e.g. 
s[3] returns the 4th code point) and are blazingly fast.
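
To make the point concrete, here is a minimal sketch in D (an illustration 
only, not Steve's code). It assumes std.utf.decode from Phobos, which throws 
UTFException on malformed input; decodeAll is a hypothetical helper name. 
Since decoding fails on invalid UTF-8, a successful decode also proves the 
input was valid, and the resulting dchar[] can be indexed by code point.

```d
import std.exception : assertThrown;
import std.utf : decode, UTFException;

/// Hypothetical helper: decode UTF-8 into an array of code points.
/// decode() throws UTFException on malformed input, so a normal return
/// also means the input was valid UTF-8.
dchar[] decodeAll(const(char)[] utf8)
{
    dchar[] result;
    size_t i = 0;
    while (i < utf8.length)
        result ~= decode(utf8, i); // decode advances i past the sequence
    return result;
}

void main()
{
    string utf8 = "añb☣c";          // 5 code points, 8 UTF-8 code units
    dchar[] s = decodeAll(utf8);

    assert(utf8.length == 8);       // char[] length counts code units
    assert(s.length == 5);          // dchar[] length counts code points
    assert(s[3] == '☣');            // s[3] is the 4th code point, in O(1)

    // 0xC3 starts a two-byte sequence, but 0x28 is not a continuation byte.
    const(char)[] bad = [cast(char) 0xC3, cast(char) 0x28];
    assertThrown!UTFException(decodeAll(bad));
}
```

The cost of this design is memory: every code point takes 4 bytes, whatever 
its UTF-8 size was.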

> This looks like a great start for a proper string type. There's still the 
> issue of literals that would require compiler/language changes.

Yep...

> There's one other issue that should be considered at some stage: 
> normalization and the fact that a single "character" can be constructed from 
> several code points. (acutes and such) 

This is my next little project. It may build on Steve's work. (But that's not 
necessary; dchar is enough as a base, I guess.)
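
As a small illustration of the issue (a sketch only, not the planned 
project): the same visible character can be one code point or two, so 
comparing dchar arrays is not enough without normalization. The example 
assumes normalize and byGrapheme from std.uni as found in current Phobos, 
which may not match the library available when this was written.

```d
import std.range : walkLength;
import std.uni : byGrapheme, normalize, NFC, NFD;

void main()
{
    dstring composed   = "\u00E9"d;   // é as a single code point
    dstring decomposed = "e\u0301"d;  // e followed by a combining acute accent

    // Both display as the same character but differ as code point arrays.
    assert(composed.length == 1);
    assert(decomposed.length == 2);
    assert(composed != decomposed);

    // Normalizing both sides to the same form makes them comparable.
    assert(normalize!NFC(decomposed) == composed);
    assert(normalize!NFD(composed) == decomposed);

    // Counting graphemes instead of code points gives one "character".
    assert(decomposed.byGrapheme.walkLength == 1);
}
```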


Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
