It's a bit brash of me to weigh in on this as my first post, but what the heck.
Our problem isn't ASCII or Unicode; our problem is how we use computers. Going back in time a bit, the first keyboards only recorded letters and spaces; even line breaks required manual intervention. As things developed, we upgraded our input capabilities a little (return keys! delete keys! arrow keys!), but then, some time before graphical displays came along, we stopped upgrading. We stopped increasing the capabilities of our input and instead focused on kludges to make it do more. We created markup languages, modifier keys, and page description languages, all because our input and display devices lacked the ability to comprehend anything more than letters. Now we're in a position where we have computers with rich displays bolted to a keyboard that has remained essentially unchanged for 150 years.

Unpopular opinion time: markup languages are a kludge, relying on plain text to describe higher-level concepts. TeX has held us back. It's a crutch so religiously embraced by the people who make our software that the concept of markup has come to be accepted as "the way". I worked with some university students recently who wasted a ridiculous amount of time learning to use LaTeX to document their projects. Many of them didn't even know that page layout software existed; they thought there was a broad valley in capabilities with TeX on one side and Microsoft Word on the other. They didn't realize that there is a whole world of purpose-built tools in between.

Rather than working on developing and furthering our input capabilities, we've been focused on keeping them the same. Markup languages aren't the solution; they are a clumsy bridge between 150-year-old input technology and modern display capabilities. Bold, italic, and underlined text shouldn't be second-class concepts; they carry meaning that can be lost when text is conveyed in circa-1868 plain text.
I've read many letters that predate the invention of the typewriter; emphasis is often conveyed using underlines or darkened letters. We've drawn this arbitrary line in the sand where only characters that can be typed on a typewriter are "text", and everything else is fluff that has been arbitrarily decided to convey no meaning. I think it's a safe argument to make that the primary reason we've painted ourselves into this unexpressive corner is a dogged insistence on clinging to the keyboard.

I like the C comment example: why do I need to call out a comment with a special sequence of letters? Why can't a comment exist as a comment? Why is a comment a second-class concept? When I take notes in the margin, I don't need to explicitly call them out as notes. The same goes for strings: why do I need to use quotes? I know it's a string; why can't the computer remember that too? Why do I have to use the capabilities of a typewriter to describe that to the computer?

There seems to be confusion that computers are inherently text based. They are only that way because we program them and use them that way, because we've done it the same way since the days of the teletype, and it's _how it's done._ "Classic" Macs are a great example of breaking this pattern. There was no way to force the computer into a text mode of operating; it didn't exist. Right down to the core, the operating system was graphical. When you click an icon, the computer doesn't issue a text command, and it doesn't call a function by name; it merely alters the flow of some binary stuff through the CPU in response to some other bits changing. Yes, the program describing that was written in text, but that text is not what the computer is interpreting.

I'm getting a bit philosophical, so I'll shut up now, but it's an interesting discussion.

- Keelan
