2011/12/31 Walter Bright <[email protected]>:
> On 12/30/2011 11:09 PM, Andrei Alexandrescu wrote:
>>
>> On 12/30/11 10:09 PM, Walter Bright wrote:
>>>
>>> I'm not so sure about that. Timon Gehr's X macro tried to handle UTF-8
>>> correctly, but it turned out that the naive version that used [i] and
>>> .length worked correctly. This is typical, not exceptional.
>>
>>
>> The lower frequency of bugs makes them that much more difficult to spot.
>> This is essentially similar to the UTF16/UCS-2 morass: in a vast majority
>> of the time the programmer may consider UTF16 a coding with one code unit
>> per code point (which is what UCS-2 is). The existence of surrogates
>> didn't make much of a difference because, again, very often the wrong
>> assumption just worked. Well that all didn't go over all that well.
>
>
> I'm not so sure it's quite the same. Java was designed before there were
> surrogate pairs, they kinda got the rug pulled out from under them. So, they
> simply have no decent way to deal with it. There isn't even a notion of a
> dchar character type. Java was designed with codeunit==codepoint, it is
> embedded in the design of the language, library, and culture.
>
> This is not true of D. It's designed from the ground up to deal properly
> with UTF. D has very simple language features to deal with it.
>
>
>> We need .raw and we must abolish .length and [] for narrow strings.
>
>
> I don't believe that fixes anything and breaks every D project out there.
> We're chasing phantoms here, and I worry a lot about over-engineering
> trivia.
>
> And, we already have a type to deal with it: dstring

I fully agree with Walter. There is no need for yet another wrapper around string.

Kenji Hara
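
For anyone following the thread, here is a minimal sketch of the distinction being argued about. It is not code from anyone's post. The helper names (codeUnits, codePoints, findAscii) are made up for illustration, and the sample text is arbitrary. It shows that .length and [i] on a narrow string count UTF-8 code units, that decoding gives code points, and that a naive code-unit search for an ASCII character still lands on the right index.

```d
import std.range : walkLength;
import std.stdio : writeln;

// Hypothetical helper: number of UTF-8 code units, i.e. what .length reports.
size_t codeUnits(string s)
{
    return s.length;
}

// Hypothetical helper: number of code points, decoded while walking the string.
size_t codePoints(string s)
{
    return s.walkLength;
}

// Hypothetical helper: naive code-unit search for an ASCII character.
// This is safe for UTF-8 because every byte of a multi-byte sequence
// is >= 0x80 and so can never compare equal to an ASCII character.
ptrdiff_t findAscii(string s, char c)
{
    foreach (i; 0 .. s.length)
        if (s[i] == c)
            return cast(ptrdiff_t) i;
    return -1;
}

void main()
{
    string  s = "héllo, wörld";   // 'é' and 'ö' take two code units each
    dstring d = "héllo, wörld"d;  // one code unit per code point

    writeln(codeUnits(s));        // 14
    writeln(codePoints(s));       // 12
    writeln(d.length);            // 12
    writeln(findAscii(s, ','));   // 6, a code-unit index

    // Asking for dchar in foreach decodes the narrow string on the fly.
    foreach (dchar ch; s) {}
}
```

The index returned by findAscii is a code-unit offset, so it is only meaningful for slicing the same narrow string, not for counting characters.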
