On Sat, 31 Dec 2016 09:20:30 +0000, Richard Wordingham wrote:
[…]
> It's in a different universe, restricted to one book, namely Footfall.

Thank you for the reference.

[…]
> Did you look in the article about Klingon, namely
> https://en.wikipedia.org/wiki/Klingon_language , or
> in the article about Klingons, namely
> https://en.wikipedia.org/wiki/Klingon ? The quote is from the former.

Iʼve looked up the wrong one, didnʼt think of the language article. Thanks for the link.

Iʼm now going back to another quotation of yours, to spin off a new thread on a topic about which I urgently need to gather more information:

On Fri, 30 Dec 2016 22:17:12 +0000, Richard Wordingham wrote:
>
> On Fri, 30 Dec 2016 20:13:41 +0100 (CET) Marcel Schneider wrote:
> >
> > > U+2E31 WORD SEPARATOR MIDDLE DOT
> > > U+30FB KATAKANA MIDDLE DOT
> >
> > These seem to me identical to U+00B7 and U+2022 respectively. Perhaps
> > weʼre here faced with two examples of what Asmus referred to as
> > “incorrectly encoded more than once” (talking of “Many other "simple"
> > marks: lines, circles, triangles, hooks, and squares, or groups of
> > them”).
>
> I was talking about what "fuels the misperception that Unicode somehow
> encodes symbols based on a single conventional usage".

I persist in believing that particular scripts like Avestan and Samaritan Aramaic can require special characters like the WORD SEPARATOR MIDDLE DOT. The wish not to fuel a misperception of Unicode character encoding could not have driven the UTC to reject this character (for version 5.2). The KATAKANA MIDDLE DOT, in turn, has been part of the standard since the beginning, like the BULLET. I imagine that a generic bullet may not be suitable for Katakana.

To get an idea of how character encoding works, people wonʼt look at scripts they donʼt know. Given that there is a misperception, one way to avoid fueling it could be to encourage character re-use. In practice, however, re-use is rather discouraged, as in the example of the Latin modifier letters, which are (basically) preformatted superscripts.
TUS states that there is no functional difference between those that have the word SUPERSCRIPT in their name and those that donʼt:

TUS 9.0, §7.8, p. 327:
| The superscript forms of the i and n letters can be found in the
| Superscripts and Subscripts block (U+2070..U+209F). The fact that the latter
| two letters contain the word “superscript” in their names instead of “modifier
| letter” is an historical artifact of original sources for the characters, and
| is not intended to convey a functional distinction in the use of these
| characters in the Unicode Standard.
http://www.unicode.org/versions/Unicode9.0.0/ch07.pdf#G24762

Probably that is intended to discourage their use as superscripts. Superscript digits, too, are confined to phonetics, and the use of superscript two and three in measurement units is merely tolerated, not encouraged:

TUS 9.0, §22.4, p. 786:
| In addition, superscript digits are used to indicate tone in transliteration
| of many languages. The use of superscript two and superscript three is common
| legacy practice when referring to units of area and volume in general texts.
http://www.unicode.org/versions/Unicode9.0.0/ch22.pdf#G42931

Consequently, the notation of the acceleration unit 'ms⁻²' doesnʼt seem to be supported by Unicode. Still, this may be considered a technical notation, which would be a reason to allow it.

These examples are intended to demonstrate the ambiguity of the recommendation to use markup and rich text format whenever vertical alignment matters, except in phonetics. I suspect that political correctness with respect to non-Latin scripts could eventually have biased Unicode’s policy, whereas Western Arabic digits and Latin letters are probably the only characters to be used extensively in super- and subscript position.
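
For anyone who wants to check the properties of the characters mentioned in this message, here is a minimal sketch using the unicodedata module from the Python standard library. The helper names describe and nfkc and the selection of code points are my own choices for illustration, and the values reported depend on the Unicode version bundled with the Python build at hand.

```python
import unicodedata

# Code points mentioned in this thread (the selection is illustrative only).
CODE_POINTS = [
    0x00B7,  # MIDDLE DOT
    0x2E31,  # WORD SEPARATOR MIDDLE DOT
    0x30FB,  # KATAKANA MIDDLE DOT
    0x2022,  # BULLET
    0x00B2,  # SUPERSCRIPT TWO
    0x00B3,  # SUPERSCRIPT THREE
    0x207B,  # SUPERSCRIPT MINUS
    0x2071,  # SUPERSCRIPT LATIN SMALL LETTER I
    0x207F,  # SUPERSCRIPT LATIN SMALL LETTER N
    0x02B0,  # MODIFIER LETTER SMALL H
]


def describe(ch):
    """Return (name, general category, decomposition mapping) for one character."""
    return (
        unicodedata.name(ch, "<unnamed>"),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),  # empty string when there is no mapping
    )


def nfkc(text):
    """Apply compatibility normalization (NFKC) to a string."""
    return unicodedata.normalize("NFKC", text)


for cp in CODE_POINTS:
    name, category, decomposition = describe(chr(cp))
    print(f"U+{cp:04X}  {category}  {name:<36} {decomposition or '(none)'}")

# Compatibility normalization folds the superscripts of 'ms⁻²' into
# MINUS SIGN (U+2212) followed by DIGIT TWO.
print(nfkc("ms\u207b\u00b2") == "ms\u2212" + "2")
```

As far as these properties go, the characters named SUPERSCRIPT and those named MODIFIER LETTER carry the same kind of <super> compatibility mapping, while none of the four dots has any decomposition, so normalization will not unify them.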

As a result, the misperception of Unicode as a one-codepoint-per-usage standard is fueled even further, and I can now better understand why our NB intended to have French ordinal indicator(s) encoded in Unicode alongside the already existing superscript Latin small letter(s). But even granting that encoding new French ordinal indicators is a really good idea, Iʼm curious about the response of the UTC. However, given that the regular process will take two years, would Unicode agree that in the meantime, the modifier letters be put in their place on the upcoming keyboard layout?

Marcel

