Re: Why incomplete subscript/superscript alphabet ?

Frédéric Grosshans Wed, 05 Oct 2016 10:06:43 -0700

Le 05/10/2016 à 15:57, Marcel Schneider a écrit :

On Wed, 5 Oct 2016 14:27:44 +0900, Martin J. Dürst wrote:

On 2016/10/04 19:35, Marcel Schneider wrote:

On Mon, 3 Oct 2016 13:47:09 -0700, Asmus Freytag (c) wrote:

Later, the beta and gamma were encoded for phonetic notation, but not the
alpha.

As a result, you can write basic formulas for select compounds, but not all.
Given that these basic formulae don't need full 2-D layout, this still seems
like an arbitrary restriction.

When itʼs about informatics, arbitrary restrictions are precisely what gets me
upset. Those limitations are—as I wrote the other day—a useless worsening
of the usability and usefulness of a product.

This kind of "let's avoid arbitrary limitations" argument works very
well for subjects that are theoretical, straightforward, and rigid in
nature. Many (but not all) subjects in computer science (informatics)
are indeed of such a nature.

The Unicode Consortium (or more specifically, the UTC) does a lot of
hard work to create theories where appropriate, and to explain them
where possible. But they recognize (and we should do so, too) that in
the end, writing is a *cultural* phenomenon, where straightforward,
rigid theories have severe limitations.

From a certain viewpoint (the chemist's in the example above), the
result may look arbitrary, but from another viewpoint (the
phoneticist's), it looks perfectly fine. At first, it looks like it
would be easy to fix such problems, but each fix risks introducing new
arbitrariness when seen from somebody else's viewpoint. Getting upset
won't help.

Iʼve got the point, thanks. Phoneticians need to write running text that is
immediately legible, while a chemistry database may use particular notational
conventions that work with baseline letters to be parsed on semantics or light
markup for proper display in the UI. The UTC decision thus questioned the design
principle of using plain text for chemical formulae. No doubt it was understood
that validating this choice would have opened the door to encoding more special
characters for upgrading or similar purposes.

I think there is a big difference between adding a few characters for a
new use (chemistry formulae) and completing an obvious, almost complete
set. People are used to seeing the 26 basic alphabetic Latin characters
(abcdefghijklmnopqrstuvwxyz) being treated preferentially by computers,
but are always surprised when only one of them is treated differently.
Initially, superscript letters were restricted to a few letters, and it
made sense to resist the temptation to complete the set. But now that
all modifier small Latin letters except q are encoded, it makes little
sense. Many people use these characters (arguably wrongly) for many uses
beyond IPA, and they are invariably surprised if they need q. The
special status of the basic Latin alphabet means that almost no one
would be surprised not to find a superscripted α, è, or ∞, and adding the
last missing basic Latin letter q would not open the door to any more
characters.
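
For anyone who wants to check the "all but q" claim on their own machine,
here is a minimal Python sketch. It is only an illustration: the helper
name superscript_map is mine, and it assumes that "superscript letter"
means a character whose compatibility decomposition is <super> followed
by the basic letter. The result depends on the Unicode data bundled with
the interpreter, so a newer version may report no gap at all.

```python
import sys
import unicodedata
from string import ascii_lowercase


def superscript_map():
    """Map each basic Latin letter to the characters that decompose to '<super>' + letter.

    Hypothetical helper for illustration; it scans every code point once.
    """
    # Decomposition strings look like '<super> 006E' (4 uppercase hex digits).
    wanted = {"<super> %04X" % ord(c): c for c in ascii_lowercase}
    found = {c: [] for c in ascii_lowercase}
    for cp in range(sys.maxunicode + 1):
        letter = wanted.get(unicodedata.decomposition(chr(cp)))
        if letter is not None:
            found[letter].append(chr(cp))
    return found


found = superscript_map()
# Letters with no superscript form in this interpreter's Unicode data.
missing = [c for c in ascii_lowercase if not found[c]]

print("Unicode data version:", unicodedata.unidata_version)
print("letters without a superscript form:", missing)
```

On data from the time of this thread, I would expect the list to contain
only q.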


At this point Iʼd like to mention what I thought about since this thread
was launched. The French language makes extensive use of superscripts
to note abbreviations. [...] Therefore I suggest to grant
the French language full support by enabling superscript lowercase letters
in order that the SUPERSCRIPT deadkey that the French Standards body recommends
will work for all abbreviations. There is no point in superscripting letters
other than the basic alphabet, as no French abbreviation exceeds this range
(despite what I believed in 2014, like many other people).

Whether è (and í) are needed or not is another question. Even if it were
useful (as argued by others in this thread), it brings non-trivial
technical difficulties in terms of NFC/NFD. But since people are used to
seeing these characters being treated differently, I think the “problem” of
the lack of superscript composed characters is less obvious than the lack
of *MODIFIER LETTER SMALL Q, in the sense that the first absence is
perceived (by the Unicode-naive user) as more normal than the second.
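
To make the NFC/NFD point a bit more concrete, here is a small Python
sketch of how the existing pieces behave today. It assumes that a
superscript è would currently be written as MODIFIER LETTER SMALL E
(U+1D49) followed by COMBINING GRAVE ACCENT (U+0300); that spelling is my
assumption for illustration, not an established convention.

```python
import unicodedata

# Assumed spelling of a "superscript è": U+1D49 followed by U+0300.
seq = "\u1d49\u0300"

nfc = unicodedata.normalize("NFC", seq)
nfd = unicodedata.normalize("NFD", seq)
nfkc = unicodedata.normalize("NFKC", seq)

# There is no precomposed character for this pair, so the canonical
# forms leave the two code points as they are.
print([unicodedata.name(ch) for ch in nfc])
print(nfc == seq, nfd == seq)

# Compatibility normalization drops the superscript styling: U+1D49
# becomes a plain 'e', which then composes with the accent into U+00E8.
print(unicodedata.name(nfkc))
```

So the canonical forms keep the sequence as two code points, while the
compatibility forms fold it into an ordinary è. A new precomposed
character would have to fit in with this existing behaviour, which is
one way to read the difficulty mentioned above.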


  Frédéric
