On 8/9/18 2:44 AM, Walter Bright wrote:
> On 8/8/2018 2:01 PM, Steven Schveighoffer wrote:
>> Here's where I'm struggling -- because a string provides indexing, slicing, length, etc. but Phobos ignores that. I can't make a new type that does the same thing. Not only that, but I'm finding the specializations of algorithms only work on the type "string", and nothing else.
>
> One of the worst things about autodecoding is it is special, it *only* steps in for strings. Fortunately, however, that specialness enabled us to save things with byCodePoint and byCodeUnit.

So it turns out that technically the problem here, even though it seemed like an autodecoding problem, is a problem with splitter.

splitter doesn't deal with encodings of character ranges at all.

For instance, when you have this:

"abc 123".byCodeUnit.splitter;

What happens is splitter only has one overload that takes one parameter, and that requires a character *array*, not a range.

The byCodeUnit result is aliased-this to its original string, so it implicitly converts back to that string to match the overload, and surprise! the elements from that splitter are strings.

Next, I tried to use a parameter:

"abc 123".byCodeUnit.splitter(" ");

Nope, still devolves to string. It turns out splitter can't figure out how to split a character range using a character array as the separator.

The only thing that does seem to work is this:

"abc 123".byCodeUnit.splitter(" ".byCodeUnit);
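For reference, here is a minimal sketch of that working form as a self-contained check. The helper name splitsInto is hypothetical and only there for illustration. It assumes a Phobos version in which the range-plus-separator overload of splitter accepts byCodeUnit ranges on both sides, as described above.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : splitter;
import std.utf : byCodeUnit;

// Hypothetical helper (not part of Phobos): splits `input` on a single
// space, passing both the input and the separator as code-unit ranges,
// and compares the resulting pieces against `expected`.
bool splitsInto(string input, string[] expected)
{
    auto parts = input.byCodeUnit.splitter(" ".byCodeUnit);
    // Each piece is itself a range, so compare piece by piece.
    return parts.equal!equal(expected);
}
```

With this, splitsInto("abc 123", ["abc", "123"]) should hold, but only because the separator was explicitly wrapped in byCodeUnit as well.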

But this goes against most algorithms in Phobos that deal with character ranges -- generally you can use any width character range, and it just works. Having a drop-in replacement for string would require splitter to handle these transcodings (and I think in general, algorithms should be able to handle them as well). Not only that, but the specialized splitter that takes no separator can split on multiple spaces, a feature I want to have for my drop-in replacement.
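To illustrate the multiple-spaces point, here is a small sketch on plain strings (deliberately not on a byCodeUnit range). The helper names splitWords and splitOnSpace are hypothetical. The only thing it is meant to show is the difference between the no-separator overload and an explicit " " separator.

```d
import std.algorithm.iteration : splitter;
import std.array : array;

// Hypothetical helper: no separator, so runs of whitespace are merged
// and no empty pieces are produced.
string[] splitWords(string s)
{
    return s.splitter.array;
}

// Hypothetical helper: explicit single-space separator, so every space
// is a boundary and consecutive spaces yield empty pieces.
string[] splitOnSpace(string s)
{
    return s.splitter(" ").array;
}
```

For example, splitWords("abc   123") gives ["abc", "123"], while splitOnSpace("abc  123") gives ["abc", "", "123"]. The first behaviour is the one I would want a drop-in string replacement to get as well.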

I'll work on adding some issues to the tracker, and potentially doing some PRs so they can be fixed.

-Steve
