On Saturday, 22 September 2018 at 08:52:32 UTC, Jonathan M Davis wrote:


Honestly, I was horrified to find out that emojis were even in Unicode. It makes no sense whatsover. Emojis are supposed to be sequences of characters that can be interepreted as images. Treating them like Unicode symbols is like treating entire words like Unicode symbols. It's just plain stupid and a clear sign that Unicode has gone completely off the rails (if it was ever on them). Unfortunately, it's the best tool that we have for the job.

According to the Unicode website, http://unicode.org/standard/WhatIsUnicode.html,

"""
Support of Unicode forms the foundation for the representation of languages and symbols in all major operating systems, search engines, browsers, laptops, and smart phones—plus the Internet and World Wide Web (URLs, HTML, XML, CSS, JSON, etc.)"""

Note, unicode supports symbols, not just characters.

The smiley face symbol predates its ':-)' usage in ascii text, https://www.smithsonianmag.com/arts-culture/who-really-invented-the-smiley-face-2058483/. It's fundamentally a symbol, not a sequence of characters. Therefore it is not unreasonable for it to be encoded with a unicode number. I do agree though, of course, that it would seem bizarre to use an emoji as a D identifier.

The early history of computer science is completely dominated by cultures who use latin script based characters, and hence, quiet reasonably, text encoding and its automated visual representation by compute based devices is dominated by the requirements of latin script languages. However, the world keeps turning and, despite DT's best efforts, China et al. look to become dominant. Even if not China, the chances are that eventually a non-latin script based language will become very important. Parochial views like "all open source code should be in ASCII" will look silly.

However, until that time D developers have to spend their time where it can be most useful. Hence the condition of whether to apply Neia's patch / ideas or not mainly depends on how much effort the donwstream effort will be (debuggers etc. as Walter pointed out), and how much the gain is. As unicode 2.0 is already supported I would take a guess that the vast majority of people with access to a computer can already enter identifiers in D that are rich enough for them. As Adam said though, it would be a good idea to at least ask!





Reply via email to