On Wed, 01 Dec 2010 03:30:07 -0500, foobar <[email protected]> wrote:

Steven Schveighoffer Wrote:
[snipped]
> 3. You have no access to the underlying array unless you're dealing with an
> actual array of dchar.

I thought of adding some kind of access.  I wasn't sure of the best way.

I was thinking of allowing direct access via opCast, because I think
casting might be a sufficient red flag to let you know you are crossing
into dangerous waters.

But it could just be as easy as making the array itself public.


-Steve

A string type should always maintain the invariant that it is a valid Unicode string. Therefore I don't like having an unsafe opCast or providing direct access to the underlying array. I feel that there should be a read-only property for that. Algorithms that manipulate char[]s should construct a new string instance, which will validate that the char[] it is being built from is a valid UTF string.
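
To make the proposal above concrete, here is a rough sketch of the idea: validate once on construction, expose the storage read-only, and have algorithms build a new instance. It is written in Python rather than D purely for illustration, and `ValidatedString` and `upper_ascii` are hypothetical names, not part of any library or of the string type under discussion.

```python
class ValidatedString:
    """Hypothetical wrapper whose instances always hold well-formed UTF-8."""

    def __init__(self, raw: bytes):
        raw = bytes(raw)      # take a private, immutable copy
        raw.decode("utf-8")   # raises UnicodeDecodeError on malformed input
        self._data = raw

    @property
    def data(self) -> bytes:
        # Read-only view: no setter is defined and bytes objects are immutable.
        return self._data

    def __str__(self) -> str:
        return self._data.decode("utf-8")


def upper_ascii(s: ValidatedString) -> ValidatedString:
    """Uppercase ASCII letters by working on a mutable copy of the code units."""
    buf = bytearray(s.data)
    for i, b in enumerate(buf):
        if 0x61 <= b <= 0x7A:  # 'a'..'z'
            buf[i] = b - 0x20
    # Building a new instance re-validates the modified bytes.
    return ValidatedString(bytes(buf))
```

The cost of this design is the copy and the validation pass on every construction from raw code units.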

Copying is not a good idea, nor is runtime validation. We can only protect the programmer so much.

The good news is that the vast majority of strings are literals, which should be properly constructed by the compiler, and immutable.

This looks like a great start for a proper string type. There's still the issue of literals that would require compiler/language changes.

That is essential: the compiler has to defer the type of string literals to the library somehow.

There's one other issue that should be considered at some stage: normalization, and the fact that a single "character" can be constructed from several code points (a base letter plus a combining acute accent, and such).
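
A small illustration of the normalization issue, again in Python rather than D and only as a sketch: the same visible "é" can be one code point or two, and the two spellings compare unequal unless both are normalized first. `same_text` is a hypothetical helper; `unicodedata.normalize` is from the Python standard library.

```python
import unicodedata

composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT


def same_text(a: str, b: str) -> bool:
    """Compare two strings after bringing both to NFC (composed) form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Different code point sequences, so a plain comparison says "different".
assert composed != decomposed
assert (len(composed), len(decomposed)) == (1, 2)

# After normalization they are recognized as the same text.
assert same_text(composed, decomposed)
```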

This is more solvable with a struct, but at this point, I'm not sure if it's worth worrying about. How common is that need?

-Steve
