Re: Unicode block for programming related symbols and codepoints?

Alfred Zett Mon, 09 Feb 2015 05:01:17 -0800

OK, I will now try to answer all of you in one mail, otherwise it gets hard to keep track of...


Shervin Afshar:

All of the requirements mentioned here can be (and are) implemented in higher levels of software (like IDEs). IMO, there isn't any need for adding new characters to Unicode to address these issues.

But then it would be incompatible from IDE to IDE, just as Python code is incompatible between 2 spaces, 4 spaces and tabs.

It's the data that is important, not the software.

Additionally, people tend to forget that simply because Unicode is doing emoji out of compatibility (or other) requirements, it does not mean that "now anything goes". I refer folks to TR51[1] (specifically sections 1.3, 8, and Annex C).
[1]: http://www.unicode.org/reports/tr51

You know, the fact that this consortium ever took emoji into consideration immediately justifies including everything everyone ever wanted. There is no such thing as important data including emoji. :)


Jean-Francois Colson:

I need a few tens of characters for a conlang I’m developing. ☺

Except two or three control characters don't make a con language.

Also, if you don't like con languages in Unicode, what's this: http://unicode.org/charts/PDF/U1F700.pdf

The problem is that Unicode only encodes characters which are effectively used today or which have been used in the past. It doesn’t encode characters which could perhaps be used in a hypothetical new programming language in the future.

So you want the font encoding scheme to be a limiting factor for new things?


Pierpaolo Bernardi:

How would your proposed character be displayed as plain text?

There is no such thing as plain text.

Even line breaks and tabs are a matter of interpretation. It's just that they usually have typographic semantics, even in programming editors, with all the side effects.

In very simple (and with that I mean shitty or not even remotely programming oriented) editors, it may show like a control character, like ␄.

Browsers and any editor passing the "based on scintilla" complexity mark of course should display something that makes more sense, like an arrow or ⍈ plus surrounding space.

Unicode is a standard for plain text.  If you require a special IDE
for your programming language then why use plain text at all?

Because binary custom encoded databases or blob files are the death of interoperability.


Konstantin Ritt:

Easier than latin1, a layout one could find on [almost] every keyboard? Good luck.

Also:

Jean-Francois Colson:

Hard to input? Not harder than the new symbols you’d like to propose. That’s only a matter of keyboard layout and input method.

Indent by pressing tab and insert the literal thing by pressing ". Nothing changes, the IDE/editor does the work on the fly.

Just that you have clean semantics, interoperability and customizability.

Beat that, APL. Where you would need >10 key bindings or an annoying software keyboard.

I’ve never used APL so I don’t remember the meanings of its symbols, but couldn’t ⍘ U+2358 APL FUNCTIONAL SYMBOL QUOTE UNDERBAR or ⍞ U+235E APL FUNCTIONAL SYMBOL QUOTE QUAD work as “string literal quotes” in a new programming language?

That's a good idea.

That still leaves the indentation character, which is harder than that, because one would want a control character with certain semantics. E.g.: for programming editors it would make sense to only allow it after line breaks and convert other occurrences into tabs.

If the IDE inputs your new character when you press tab, then your new character is a tab…

Not if it detects the beginning of a line.
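
A rough sketch of that rule, only to make the idea concrete. The indentation character does not exist, so U+E000 from the Private Use Area stands in for it here, and the names INDENT, normalize_indent_chars and on_tab_key are made up for this example. It assumes "beginning of a line" means "nothing but indentation characters before the cursor".

```python
# Hypothetical stand-in for the proposed indentation control character.
# No such code point is encoded; U+E000 (Private Use Area) is an assumption.
INDENT = "\uE000"


def normalize_indent_chars(text: str) -> str:
    """Keep INDENT only in the leading run of each line.

    Any INDENT that appears after other content is converted to a tab.
    """
    out = []
    for line in text.splitlines(keepends=True):
        rest = line.lstrip(INDENT)               # everything after the leading run
        lead = line[: len(line) - len(rest)]     # the leading INDENT characters
        out.append(lead + rest.replace(INDENT, "\t"))
    return "".join(out)


def on_tab_key(line: str, cursor: int) -> str:
    """Return what an editor could insert when Tab is pressed at `cursor`."""
    # Still inside the leading indentation: insert the indentation character.
    at_line_start = line[:cursor].strip(INDENT) == ""
    return INDENT if at_line_start else "\t"
```

So pressing Tab at the start of a line would give the indentation character, and pressing it anywhere else would give an ordinary tab. A file that arrives with the character in the wrong place gets cleaned up on load.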

Best regards

A. Z.

_______________________________________________
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode
