On Wed, 01 Dec 2010 03:30:07 -0500
foobar <[email protected]> wrote:

> Steven Schveighoffer Wrote:
> [snipped]
> > > 3. You have no access to the underlying array unless you're dealing
> > > with an actual array of dchar.
> >
> > I thought of adding some kind of access. I wasn't sure the best way.
> >
> > I was thinking of allowing direct access via opCast, because I think
> > casting might be a sufficient red flag to let you know you are crossing
> > into dangerous waters.
> >
> > But it could just be as easy as making the array itself public.
> >
> > -Steve
>
> A string type should always maintain the invariant that it is a valid unicode
> string. Therefore I don't like having an unsafe opCast or providing direct
> access to the underlying array. I feel that there should be a read-only
> property for that. Algorithms that manipulate char[]'s should construct a new
> string instance which will validate the char[] it is being built from is a
> valid utf string.
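
To make the quoted proposal concrete, here is a minimal sketch of a type that validates on construction and exposes its code units read-only. The struct name ValidString and its units property are invented for illustration and are not Steve's actual code; the only library call assumed is std.utf.validate, which throws on malformed input.

```d
import std.utf : validate;

// Hypothetical wrapper; the name ValidString is invented for illustration.
struct ValidString
{
    private string data; // invariant: always well-formed UTF-8

    // Build from arbitrary code units; std.utf.validate throws a
    // UTFException if the input is not well-formed.
    this(const(char)[] raw)
    {
        validate(raw);
        data = raw.idup; // private copy, so callers cannot alter it later
    }

    // Read-only access: immutable code units, no opCast, no public array.
    @property string units() const
    {
        return data;
    }
}
```

An algorithm would then take s.units, work on its own char[] copy, and hand the result back to the constructor, which validates it again.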

But then, why not store a dchar[] array systematically? Validation and
decoding are the same job. Once decoded, all methods work as expected
(e.g. s[3] returns the 4th code point) and are blazingly fast.

> This looks like a great start for a proper string type. There's still the
> issue of literals that would require compiler/language changes.

Yep...

> There's one other issue that should be considered at some stage:
> normalization and the fact that a single "character" can be constructed from
> several code points. (acutes and such)

This is my next little project. It may build on Steve's work. (But that is not
necessary; dchar is enough as a base, I guess.)

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
