On 9/22/18 12:56 PM, Neia Neutuladh wrote:
On Saturday, 22 September 2018 at 12:35:27 UTC, Steven Schveighoffer wrote:
But aren't we arguing about the wrong thing here? D already accepts non-ASCII identifiers.

Walter was doing that thing that people in the US who only speak English tend to do: forgetting that other people speak other languages, and that people who speak English can learn other languages to work with people who don't speak English.

I don't think he was doing that. I think what he was saying was, D tried to accommodate users who don't normally speak English, and they still use English (for the most part) for coding.

I'm actually surprised there isn't much code out there that is written with other identifiers besides ASCII, given that C99 supported them. I assumed it was because they weren't supported. Now I learn that they are supported, yet almost all C code I've ever seen is written in English. Perhaps that's just because I don't frequent foreign language sites though :) But many people here speak English as a second language, and vouch for their cultures still using English to write code.

He was saying it's inevitably a mistake to use non-ASCII characters in identifiers and that nobody does use them in practice.

I would expect people probably do try to use them in practice, it's just that the problems they run into aren't worth the effort (tool/environment support). But I have no first or even second hand experience with this. It does seem like Walter has a lot of experience with it though.

Walter talking like that sounds like he'd like to remove support for non-ASCII identifiers from the language. I've gotten by without maintaining a set of personal patches on top of DMD so far, and I'd like it if I didn't have to start.

I don't think he was saying that. I think he was against expanding support for further Unicode identifiers because the first effort did not produce any measurable benefit. I'd be shocked, given the recent positions of Walter and Andrei, if they decided to remove non-ASCII identifiers that are currently supported, thereby breaking any existing code.

What languages need an upgrade to Unicode symbol names? In other words, what symbols aren't possible with the current support?

Chinese and Japanese have gained about eleven thousand symbols since Unicode 2.

Unicode 2 covers 25 writing systems, while Unicode 11 covers 146. Just updating to Unicode 3 would give us Cherokee, Ge'ez (multiple languages), Khmer (Cambodian), Mongolian, Burmese, Sinhala (Sri Lanka), Thaana (Maldivian), Canadian aboriginal syllabics, and Yi (Nuosu).

Very interesting! I would agree that we should at least add support for Unicode symbols that are used in spoken languages, especially since we already support symbols that aren't ASCII. I don't see the downside, especially if you can already use Unicode 2.0 symbols for identifiers (the ship has already sailed).

It could be a good incentive to get kids in countries where English isn't commonly spoken to try D out as a first programming language ;) Using your native language to show example code could be a huge benefit for teaching coding.

My recommendation is to put the PR up for review (that you said you had ready) and see what happens. Having an actual patch to talk about could change minds. At the very least, it's worth not wasting your efforts that you have already spent. Even if it does need a DIP, the PR can show that one less piece of effort is needed to get it implemented.

-Steve
