On Tue, 11 Jan 2011 11:54:08 -0500, Andrei Alexandrescu
<[email protected]> wrote:
On 1/11/11 5:30 AM, Steven Schveighoffer wrote:
While this makes it possible to write algorithms that only accept
VLERanges, I don't think it solves the major problem with strings --
they are treated as arrays by the compiler.
Except when they're not - foreach with dchar...
This solitary difference is a very thin argument -- foreach(d;
byDchar(str)) would be just as good without requiring compiler help.
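To make the comparison concrete, here is a minimal sketch of the two forms side by side. It assumes a Phobos that provides std.utf.byDchar (which may postdate this thread); the helper names countViaForeach and countViaRange are made up for the example.

```d
import std.utf : byDchar;

// Compiler-assisted: declaring the loop variable as dchar makes
// foreach decode the UTF-8 code units into code points.
size_t countViaForeach(string s)
{
    size_t n;
    foreach (dchar d; s)
        ++n;
    return n;
}

// Library-only: the decoding is done by a range adapter, so the
// compiler needs no special knowledge of strings.
size_t countViaRange(string s)
{
    size_t n;
    foreach (d; s.byDchar)
        ++n;
    return n;
}

void main()
{
    string str = "h\u00E9llo";      // U+00E9 takes two UTF-8 code units
    assert(str.length == 6);         // length counts code units
    assert(countViaForeach(str) == 5);
    assert(countViaRange(str) == 5);
}
```

Both loops visit the same five code points for valid input; only the second spells out where the decoding happens.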
I'd also rather see an indexing operation return the element type, and
have a separate function to get the encoding unit. This makes more sense
for generic code IMO.
But that's neither here nor there. That would return the logical element
at a physical position. I am very doubtful that much generic code could
work without knowing they are in fact dealing with a variable-length
encoding.
It depends on the function, and the way the indexing is implemented.
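As a rough illustration of the indexing scheme being debated (not the proposed string type itself), the sketch below indexes by physical code-unit position but returns the decoded element, with a separate accessor for the raw code unit. The names DecodedView and codeUnit are hypothetical; std.utf.decode is the only library call used.

```d
import std.utf : decode;

// Hypothetical wrapper: indexing yields the logical element (a code
// point) that starts at a physical code-unit offset.
struct DecodedView
{
    string data;

    // Decodes the code point beginning at offset i. std.utf.decode
    // throws a UTFException if i falls in the middle of a sequence.
    dchar opIndex(size_t i)
    {
        size_t pos = i;
        return decode(data, pos);
    }

    // Explicit, separately named access to the encoding unit.
    char codeUnit(size_t i)
    {
        return data[i];
    }
}

void main()
{
    auto v = DecodedView("h\u00E9llo");
    assert(v[0] == 'h');
    assert(v[1] == '\u00E9');        // whole code point, two units wide
    assert(v.codeUnit(1) == 0xC3);   // first UTF-8 unit of U+00E9
    assert(v[3] == 'l');             // offsets stay physical
}
```

Offsets 1 and 3 are adjacent elements here, which is the point of contention: generic code that steps an index by one would land mid-sequence at offset 2.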
I noticed you never commented on my proposed string type...
That reminds me, I should update with suggested changes and re-post it.
To be frank, I think it didn't mark a visible improvement. It solved
some problems and brought others. There was disagreement over the
offered primitives and their semantics.
It is supposed to be simple, and provide the expected interface, without
causing any undue performance degradation. That is, I should be able to
do all the things with a replacement string type that I can with a char
array today, as efficiently as I can today, except I should have to work
to get at the code-units. The huge benefit is that I can say "I'm dealing
with this as an array" when I know it's safe.
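A minimal sketch of what "treat it as an array when I know it's safe" can look like with today's library, assuming std.string.representation is available. The helper names isAllAscii and countByte are invented for the example and are not part of any proposal in this thread.

```d
import std.string : representation;

// True when every code unit is below 0x80, i.e. the string is pure
// ASCII and each code unit is also a complete character.
bool isAllAscii(string s)
{
    foreach (ubyte b; s.representation)
        if (b >= 0x80)
            return false;
    return true;
}

// Counts occurrences of an ASCII character by walking the raw code
// units, with no decoding step.
size_t countByte(string s, char c)
{
    size_t n;
    foreach (ubyte b; s.representation)
        if (b == c)
            ++n;
    return n;
}

void main()
{
    string header = "key=value;other=thing";
    assert(isAllAscii(header));
    assert(countByte(header, '=') == 2);
    assert(!isAllAscii("h\u00E9llo"));
}
```

The opt-in here is the explicit call to representation, which hands back the code units as a plain ubyte array.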
The disagreement will never be fully solved, as there is just as much
disagreement about the current state of affairs ;) e.g. should foreach
default to using dchar?
That being said, it's good you are doing this work. In the best case,
you could bring a compelling abstraction to the table. In the worst,
you'll become as happy about D's strings as I am :o).
I don't think I'll ever be 'happy' with the way strings sit in Phobos
currently. I typically deal in ASCII (i.e. code units), and Phobos works
very hard to prevent that.
-Steve