On 05/10/2016 at 15:57, Marcel Schneider wrote:
On Wed, 5 Oct 2016 14:27:44 +0900, Martin J. Dürst wrote:
On 2016/10/04 19:35, Marcel Schneider wrote:
On Mon, 3 Oct 2016 13:47:09 -0700, Asmus Freytag (c) wrote:

Later, the beta and gamma were encoded for phonetic notation, but not the
alpha.

As a result, you can write basic formulas for select compounds, but not all.
Given that these basic formulae don't need full 2-D layout, this still seems
like an arbitrary restriction.
When itʼs about informatics, arbitrary restrictions are precisely what gets me
upset. Those limitations are—as I wrote the other day—a useless worsening
of the usability and usefulness of a product.
This kind of "let's avoid arbitrary limitations" argument works very
well for subjects that are theoretical, straightforward, and rigid in
nature. Many (but not all) subjects in computer science (informatics)
are indeed of such a nature.

The Unicode Consortium (or more specifically, the UTC) does a lot of
hard work to create theories where appropriate, and to explain them
where possible. But they recognize (and we should do so, too) that in
the end, writing is a *cultural* phenomenon, where straightforward,
rigid theories have severe limitations.

From a certain viewpoint (the chemist's in the example above), the
result may look arbitrary, but from another viewpoint (the
phoneticist's), it looks perfectly fine. At first, it looks like it
would be easy to fix such problems, but each fix risks introducing new
arbitrariness when seen from somebody else's viewpoint. Getting upset
won't help.
Iʼve got the point, thanks. Phoneticians need to write running text that is
immediately legible, while a chemistry database may use particular notational
conventions that work with baseline letters to be parsed on semantics or light
markup for proper display in the UI. The UTC decision thus questioned the design
principle of using plain text for chemical formulae. No doubt it was understood
that validating this choice would have opened the door to encoding more special
characters for upgrading or similar purposes.
I think there is a big difference between adding a few characters for a new use (chemistry formulae) and completing an obvious, almost complete set. People are used to seeing the 26 basic Latin alphabetic characters (abcdefghijklmnopqrstuvwxyz) being treated preferentially by computers, but are always surprised when only one of them is treated differently. Initially, superscript letters were restricted to a few letters, and it made sense to resist the temptation to complete the set. But now that all modifier small Latin letters except q are encoded, it makes little sense. Many people use these characters (arguably wrongly) for many uses beyond IPA, and they are invariably surprised if they need q. The special status of the basic Latin alphabet means that almost no one would be surprised not to find a superscripted α, è, or ∞, and adding the last missing basic Latin letter q would not open the door to any more characters.
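
For anyone who wants to check the "all except q" claim against a given copy of the Unicode data, here is a minimal sketch using Python's standard unicodedata module. It assumes that "superscript form" means a character whose compatibility decomposition is `<super>` followed by a single basic Latin lowercase letter; the function name superscript_map is made up for this illustration. The result depends on the Unicode version bundled with the Python build (reported by unicodedata.unidata_version), so a newer database may list forms that did not exist when this thread was written.

```python
import string
import sys
import unicodedata


def superscript_map():
    """Map each basic Latin lowercase letter to the code points whose
    compatibility decomposition is '<super>' plus that single letter."""
    # Decomposition strings look like '<super> 0061' for a superscript 'a'.
    wanted = {"<super> %04X" % ord(c): c for c in string.ascii_lowercase}
    found = {c: [] for c in string.ascii_lowercase}
    for cp in range(sys.maxunicode + 1):
        letter = wanted.get(unicodedata.decomposition(chr(cp)))
        if letter is not None:
            found[letter].append(cp)
    return found


found = superscript_map()
print("Unicode data version:", unicodedata.unidata_version)
for letter in string.ascii_lowercase:
    forms = ", ".join(
        "U+%04X %s" % (cp, unicodedata.name(chr(cp), "<unnamed>"))
        for cp in found[letter]
    )
    print(letter, "->", forms or "(no superscript form in this version)")

# Letters with no superscript form at all in the bundled database.
missing = [c for c in string.ascii_lowercase if not found[c]]
print("missing:", missing)
```

Note that the scan also picks up characters such as U+00AA FEMININE ORDINAL INDICATOR, which decompose to a superscripted letter without being named "modifier letter".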


At this point Iʼd like to mention what I thought about since this thread
was launched. The French language makes extensive use of superscripts
to note abbreviations. [...] Therefore I suggest granting
the French language full support by enabling superscript lowercase letters
so that the SUPERSCRIPT dead key that the French standards body recommends
will work for all abbreviations. There is no need to superscript letters
beyond the basic alphabet, as no French abbreviation exceeds this range (despite
what I believed in 2014, like many other people).
Whether è (and í) are needed or not is another question. Even if it were useful (as argued by others in this thread), it brings non-trivial technical difficulties in terms of NFC/NFD. But since people are used to seeing these characters being treated differently, I think the “problem” of the lack of superscript composed characters is less obvious than the lack of *MODIFIER LETTER SMALL Q, in the sense that the first absence is perceived (by the Unicode-naive user) as more normal than the second.
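
The NFC/NFD difficulty is not spelled out in this thread, so the following is only one reading of it, sketched with Python's standard unicodedata module: an ordinary e followed by a combining grave composes to the single code point è under NFC, whereas U+1D49 MODIFIER LETTER SMALL E followed by the same combining mark has no precomposed form to compose into, and compatibility normalization (NFKC) discards the superscript distinction entirely. The variable names are illustrative only.

```python
import unicodedata

SUP_E = "\u1d49"   # MODIFIER LETTER SMALL E
GRAVE = "\u0300"   # COMBINING GRAVE ACCENT

# Ordinary letter plus combining mark: NFC composes to U+00E8.
plain_nfc = unicodedata.normalize("NFC", "e" + GRAVE)

# Superscript letter plus combining mark: nothing to compose into,
# so NFC leaves the sequence as two code points.
sup_nfc = unicodedata.normalize("NFC", SUP_E + GRAVE)

# NFKC first maps the modifier letter to a plain 'e', then composes,
# so the result is indistinguishable from an ordinary è.
sup_nfkc = unicodedata.normalize("NFKC", SUP_E + GRAVE)

for label, text in [("plain NFC", plain_nfc),
                    ("superscript NFC", sup_nfc),
                    ("superscript NFKC", sup_nfkc)]:
    print(label, [unicodedata.name(ch) for ch in text])
```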

  Frédéric
