On 3/9/14, 8:18 AM, Vladimir Panteleev wrote:
On Sunday, 9 March 2014 at 05:10:26 UTC, Andrei Alexandrescu wrote:
On 3/8/14, 8:24 PM, Vladimir Panteleev wrote:
On Sunday, 9 March 2014 at 04:18:15 UTC, Andrei Alexandrescu wrote:
What exactly is the consensus? From your wiki page I see "One of the
proposals in the thread is to switch the iteration type of string
ranges from dchar to the string's character type."

I can tell you straight out: That will not happen for as long as I'm
working on D.

Why?

From the cycle "going in circles": because I think the breakage is way
too large compared to the alleged improvement.

All right. I was wondering if there was something more fundamental
behind such an ultimatum.

It's just factual information with no drama attached (i.e. I'm not threatening to leave the language, just plainly explaining that I'll never approve that particular change).

That said, a larger explanation is in order. There have been cases in the past when our community has worked itself into a froth over a non-issue and ultimately caused a language change imposed by "the faction that shouted the loudest". The "lazy" keyword and recently the "virtual" keyword come to mind as cases in which the language leadership has been essentially annoyed into making a change it didn't believe in.

I am all about listening to the community's needs and desires. But at some point there is a need to stick to one's guns on judgment calls. See e.g. https://d.puremagic.com/issues/show_bug.cgi?id=11837 for a very recent example in which reasonable people may disagree but at some point you can't choose both options.

What we now have works as intended. As I mentioned, there is quite a bit more evidence that the design is useful to people than that it is detrimental. Unicode is all about code points. Code units are incidental to each encoding. The fact that we recognize code points at language and library level is, in my opinion, a Good Thing(tm).
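
To make the code unit / code point distinction concrete, here is a minimal sketch, assuming a Phobos in which string ranges iterate by dchar (the behavior under discussion). The helper names countCodeUnits and countCodePoints are made up for this illustration and are not library functions.

```d
import std.range : walkLength;

// Hypothetical helper: counts raw UTF-8 code units.
size_t countCodeUnits(string s)
{
    size_t n;
    foreach (char c; s) ++n;   // char loop variable: no decoding
    return n;
}

// Hypothetical helper: counts decoded code points.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s) ++n;  // dchar loop variable: foreach decodes
    return n;
}

void main()
{
    string s = "Привет";               // six Cyrillic letters, two bytes each in UTF-8
    assert(s.length == 12);            // .length counts code units
    assert(countCodeUnits(s) == 12);
    assert(countCodePoints(s) == 6);
    assert(s.walkLength == 6);         // range primitives on strings yield dchar
}
```

The array-level view (.length, indexing) works in code units, while the range-level view works in code points; that split is what the proposal quoted earlier would change.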

I understand that doesn't reach the ninth level of Nirvana and there are still issues to work on, and issues where good-looking code is actually incorrect. But I think we're overall in good shape. A regression from that to code unit level would be very destructive. Even a clear slight improvement that breaks backward compatibility would be destructive.

So I wanted to limit the potential damage of this discussion. It is all the more dangerous because Walter himself started it, something that others didn't fail to pick up on. The sheer fact that we got to contemplate an unbelievably massive breakage on no other evidence than one misuse case and for the sake of a possibly illusory improvement - that's a sign we need to grow up. We can't go about changing the language like this and aim to play in the big leagues.

In fact I believe that that design is inferior to the current one
regardless.

I was hoping we could come to an agreement at least on this point.

Sorry to disappoint.

---

BTW, a thought struck me while thinking about the problem yesterday.

char and dchar should not be implicitly convertible between one another,
or comparable to the other.

I think only the char -> dchar conversion works, and I can see arguments against it. Also, comparison of char with dchar is dicey. But there are also cases in which it's legitimate to do that (e.g. assigning ASCII chars), and this would be a breaking change.

One good way to think about breaking changes is "if this change were executed to perfection, how much would that improve the overall quality of D?" Because breakages _are_ "overall" - users don't care whether they come from this or the other part of the type system. That really puts things into perspective.

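void main()
{
    string s = "Привет";
    // c is inferred as char, i.e. one UTF-8 code unit, while 'Ñ' is U+00D1.
    // The assert fires: 'р' (U+0440) is encoded as D1 80, and its lead
    // byte 0xD1 compares equal to 'Ñ' after implicit conversion.
    foreach (c; s)
        assert(c != 'Ñ');
}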

Instead, std.conv.to should allow converting between character types,
iff they represent one whole code point and fit into the destination
type, and throw an exception otherwise (similar to how it deals with
integer overflow). Char literals should be special-cased by the compiler
to implicitly convert to any sufficiently large type.

This would break more[1] code, but it would avoid the silent failures of
the earlier proposal.

[1] I went through my own larger programs. I actually couldn't find any
uses of dchar which would be impacted by such a hypothetical change.
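
The checked conversion proposed above could look roughly like the following sketch. convertChar is a hypothetical name, not an existing std.conv function, and the exact rules (what counts as "one whole code point", which exception type to throw) are my assumptions about the intended semantics.

```d
import std.conv : ConvException;
import std.exception : assertThrown;
import std.traits : isSomeChar, Unqual;

/// Hypothetical helper (not part of Phobos): converts between character
/// types only if the source is one whole code point that fits in a single
/// unit of the destination type; throws ConvException otherwise.
To convertChar(To, From)(From c)
if (isSomeChar!To && isSomeChar!From)
{
    immutable uint v = c;

    // Does the source value stand for one whole code point?
    static if (is(Unqual!From == char))
        immutable bool whole = v < 0x80;                    // lone UTF-8 unit: ASCII only
    else
        immutable bool whole = (v < 0xD800 || v > 0xDFFF)   // not a surrogate
                               && v <= 0x10FFFF;            // within Unicode range

    // Does that code point fit in one unit of the destination type?
    static if (is(Unqual!To == char))
        immutable bool fits = v < 0x80;
    else static if (is(Unqual!To == wchar))
        immutable bool fits = v <= 0xFFFF;
    else
        immutable bool fits = true;

    if (!whole || !fits)
        throw new ConvException("character does not fit in destination type");
    return cast(To) v;
}

void main()
{
    assert(convertChar!dchar('a') == 'a');      // ASCII widens safely
    assert(convertChar!wchar('Ñ') == 'Ñ');      // U+00D1 fits in one UTF-16 unit

    // U+00D1 needs two UTF-8 code units, so it cannot become a char.
    assertThrown!ConvException(convertChar!char('Ñ'));

    // A lone UTF-8 lead byte is not a whole code point.
    char unit = 0xD1;
    assertThrown!ConvException(convertChar!dchar(unit));
}
```

With a function like this, the earlier loop would fail loudly at the conversion instead of silently comparing a lead byte against 'Ñ'.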

Generally I think we should steer away from slight improvements of the language at the cost of breaking existing code. Instead, we must think of ways to improve the language without the breakage. You may want to pursue (bugzilla + pull request) adding the std.conv routines with the semantics you mentioned.


Andrei
