On 08/19/2010 09:22 PM, dsimcha wrote:
As I mentioned buried deep in another thread, std.string is in serious need of
fixing, for two reasons:

1.  Most of it doesn't work with UTF-16/UTF-32 strings.

2.  Much of it requires the input to be immutable even when there's no good
reason for this constraint.

Absolutely. Thanks for looking into this!

I'm trying to understand a few things before I dive into fixing it:

1.  How did it get to be this way?  Why did it seem like a good idea at the
time to only support UTF-8 and only immutable strings?

I don't know - my guess is that UTF-8 is widespread in English-speaking countries and this is one.

2.  Is there any "deep" design/technical issue that makes these hard to fix,
or is it basically just lack of manpower and other priorities?

The latter. I wanted to get to this for the longest time, and I think it's awesome that you're looking into it.

3.  Is there any good reason to avoid just templating everything to work with
all 9 string types (mutable/const/immutable char/wchar/dchar[]) or whatever
subset is reasonable for the given function?

There's no reason. But I hope we'd go a step further:

a) Aggressively make everything string-specific more general and move it into std.algorithm.

b) After (a), ideally std.string should contain only a modicum of string-specific stuff such as case and whitespace information. I believe the functionality of the following functions could easily be generalized and moved to std.algorithm or std.range, perhaps consolidated with existing functionality and under a different name: cmp, indexOf, lastIndexOf, repeat, join, split, stripl, stripr, strip, chomp, chompPrefix, replace, replaceSlice, insert, count, maketrans, translate, squeeze, munch, succ, tr.

The other functions (or certain overloads of the above) stay put in std.string and should indeed be templated on the input type with the constraint

if (isSomeString!Str)

or better yet allow any input, forward, or bidirectional range (as the algorithm needs) constrained by

if (isXxxRange!R && is(ElementType!R : dchar)).
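
To make the two constraints concrete, here is a minimal sketch, assuming a Phobos in which std.range exposes isInputRange, ElementType, and the array range primitives. The functions countSpacesStr and countSpaces are hypothetical names made up for illustration and are not part of std.string; isInputRange stands in for the isXxxRange placeholder above.

```d
import std.range : isInputRange, ElementType, empty, front, popFront;
import std.traits : isSomeString;

// Hypothetical example, not a Phobos function.
// Accepts any of the nine string types (mutable/const/immutable
// char/wchar/dchar arrays), but nothing else.
size_t countSpacesStr(Str)(Str s)
    if (isSomeString!Str)
{
    size_t n = 0;
    foreach (dchar c; s)   // decodes UTF-8/UTF-16 into code points
        if (c == ' ')
            ++n;
    return n;
}

// Hypothetical example, not a Phobos function.
// Accepts any input range whose elements convert to dchar; strings
// qualify because their range element type is dchar.
size_t countSpaces(R)(R r)
    if (isInputRange!R && is(ElementType!R : dchar))
{
    size_t n = 0;
    for (; !r.empty; r.popFront())
        if (r.front == ' ')
            ++n;
    return n;
}

void main()
{
    assert(countSpacesStr("a b c") == 2);      // immutable(char)[]
    assert(countSpaces("a b c"w.dup) == 2);    // mutable wchar[]
    const(dchar)[] d = "a b c"d;
    assert(countSpaces(d) == 2);               // const(dchar)[]
}
```

The range-based version is the more general of the two: it needs no per-type overloads and would also accept non-array ranges of characters, which is the direction suggested in (a).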

Thanks again for looking into this, it's important and rewarding work.


Andrei
