On Thursday, 6 September 2018 at 11:43:31 UTC, ag0aep6g wrote:

You say that D users shouldn't need a '"Unicode license" before they do anything with strings'. And you say that Python 3 gets it right (or maybe less wrong than D).

But here we see that Python requires a similar amount of Unicode knowledge. Without your Unicode license, you couldn't make sense of `len` giving different results for two strings that look the same.
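For anyone following along, the `len` mismatch mentioned above can be reproduced with a minimal sketch like the one below. It assumes the two look-alike strings are the precomposed and the decomposed form of "á"; the thread does not say which strings were actually used, and the variable names are only illustrative.

```python
import unicodedata

# Assumption: the two "identical-looking" strings are the precomposed and
# the decomposed spelling of the same character.
precomposed = "\u00e1"   # á as a single code point (U+00E1)
decomposed = "a\u0301"   # 'a' followed by a combining acute accent (U+0301)

# Python's len counts code points, so the two spellings differ.
print(len(precomposed))  # 1
print(len(decomposed))   # 2

# They only compare equal after normalizing to the same form.
print(precomposed == decomposed)                                 # False
print(unicodedata.normalize("NFC", decomposed) == precomposed)   # True
```

Both strings render the same, which is why the differing lengths only make sense once you know about combining characters and normalization.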

So both D and Python require a Unicode license. But on top of that, D also requires an auto-decoding license. You need to know that `string` is both a range of code points and an array of code units. And you need to know that `.length` belongs to the array side, not the range side. Once you know that (and more), things start making sense in D.
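The dual nature described above can be approximated in Python. This is only an analogy, not D code: it assumes the array side corresponds to counting UTF-8 code units and the range side to counting code points. The helper names `utf8_code_units` and `code_points` are made up for illustration.

```python
def utf8_code_units(s: str) -> int:
    """Count UTF-8 code units, roughly what the array view of the string sees."""
    return len(s.encode("utf-8"))


def code_points(s: str) -> int:
    """Count code points, roughly what iterating the decoded range sees."""
    return len(s)


s = "\u00e1"  # precomposed á

print(utf8_code_units(s))  # 2
print(code_points(s))      # 1
```

The same text gives two different "lengths" depending on which view you ask, which is the confusion being discussed.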

You'll need some basic knowledge of Unicode if you deal with strings, that's for sure. But you don't need a "license", and it certainly shouldn't be used as an excuse for D's confusing nature when it comes to strings. Unicode is confusing enough, so there's no need to add another layer of complexity to confuse users further. And you most certainly shouldn't blame the user for being confused. AFAIK, there's no warning label with an accompanying user manual for string handling.

My point is: D doesn't require more Unicode knowledge than Python. But D's auto-decoding gives `string` a dual nature, and that can certainly be confusing. It's part of why everybody dislikes auto-decoding.

D should be clear about it. I think it's too late for `string` to change its behavior (i.e. "á".length = 1). If you wanna change `string`'s behavior now, maybe a compiler switch would be an option for the transition period: -autodecode=off.

Maybe a new type of string could be introduced that behaves like one would expect, say `ustring` for correct Unicode handling. Or `string` does that and you introduce a new type for high performance tasks (`rawstring` would unfortunately be confusing).

The thing is that even basic things like string handling are complicated and flawed, so I don't want to use D for any future projects, and I don't have the time to wait until it gets fixed one day, if it ever gets fixed, that is. Nor does it seem to be a priority compared with other things that are maybe less important for production. But at least I'm wiser after this thread, since it has been made clear that things are not gonna change soon, at least not soon enough for me.

This is why I'll file for D-vorce :) Will it be difficult? Maybe at the beginning, but it will make things easier in the long run. And at the end of the day, if you have to fix and rewrite parts of your code again and again due to frequent language changes, you might as well port it to a different PL altogether. But I have no hard feelings, it's a practical decision I had to make based on pros and cons.

[snip]

