On Tue, 3 Jan 2017 09:31:42 +0100, Christoph Päper wrote:

> > Among the possibilities, you include Unicode subscripts.
>
> Just for the sake of completeness.

This suggests that preformatted subscripts really are an option here.

The TUS snippets [1][2] and common practice show that whatever characters are on the keyboard get used or re-used as superscripts, such as the degree sign as superscript o and the feminine ordinal indicator as superscript a. Layouts are bafflingly inconsistent across countries: the Belgian AZERTY layout has SUPERSCRIPT THREE where its French (France) counterpart has an empty shift state, while SUPERSCRIPT ONE is missing on both, even though the AltGr shift state is only partially used and all three superscript digits are part of Latin-1. Thus, awareness of the usefulness of a given character does not always correlate closely with its presence on the keyboard.

In the Unicode era, this may grow into the insight that the availability of an almost complete range of superscripts and a set of subscripts, including Latin letters, calls for adding them to national keyboard layouts to meet the demand of increasingly important user groups and communities.

Supporting this does not ultimately require the Unicode Standard to be reworded, because TUS mainly reflects encoding principles and usage recommendations and is not a typography manual. TUS 9.0, §22.4, p. 786, explains that the recommendation not to use preformatted characters outside phonetics is a mere application of a design principle, regardless of the practical usefulness of the scheme.

I note that in the snippet quoted below, the digit string in “DC0016” has already been messed up by copy-pasting it to plain text. By contrast, copying it from Adobe Reader to Microsoft Word carries the font size difference over, but not the vertical alignment, presumably because the original specifies a custom subscript style that has no generic subscripting information and is not cross-platform compatible. This example highlights a serious downside of the markup-based representation scheme.

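The loss described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the HTML-style `<sub>` markup and the helper name `strip_markup` are hypothetical stand-ins for whatever rich-text format and copy operation are actually involved, not a description of what Adobe Reader or Word do internally.

```python
import re

# Hypothetical helper: mimics a plain-text copy by dropping all markup tags.
def strip_markup(text: str) -> str:
    return re.sub(r"<[^>]+>", "", text)

# Map ASCII digits to the subscript digits U+2080..U+2089.
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

# Subscript carried by markup (assumed HTML-like syntax).
marked_up = "DC<sub>0016</sub>"
# Subscript carried by the characters themselves.
preformatted = "DC" + "0016".translate(SUBSCRIPT_DIGITS)

print(strip_markup(marked_up))     # DC0016 (subscript information lost)
print(strip_markup(preformatted))  # DC₀₀₁₆ (subscript information kept)
```

The preformatted string passes through unchanged because the subscripting is part of the character data, which is exactly the property that matters when plain text is a requirement.
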
As demonstrated with the apostrophe, a recommendation may be changed according to common practice, and reconsidered in the light of differently weighted rules and principles, along the lines of what Asmus Freytag pointed out on December 28ᵗʰ, 2016, in reply to Richard Wordingham:

> > > > Ideal solutions can also be defeated by limited keyboard layouts. As a
> > > > result, I have no idea whether the singular of "fithp" (one of Larry
> > > > Niven's alien species) should be spelt with U+02BC or U+2019, though in
> > > > ASCII I can just write "fi'".
> > >
> > > The only place where "uni" doesn't apply in Unicode is that there's never
> > > just a single principle that applies, but always multiple ones that are
> > > in tension --- and in the edge cases, the tension can be felt keenly.

As another example, seen in a 2015 thread on plain-text custom fractions, the English Microsoft Community website hosts recommendations on how to insert fractions made of superscripts, subscripts and the fraction slash U+2044, using a list of autocorrect entries in Word. To test this, Iʼve added four items to the autocorrect list, converting '.s.' to 'ˢᵗ', '.n.' to 'ⁿᵈ', '.r.' to 'ʳᵈ', and '.t.' to 'ᵗʰ'. The result looks fine in Cambria and bad in incomplete fonts mixed with a fallback font, while Arial renders the superscript 'n' in a non-standard way, as a legacy remnant, despite TUS specifying that all those characters should be harmonized. Itʼs up to the user to choose the best-fitting option depending on usage and environment.

As already discussed, formatting is a working solution on condition that plain text will never be a requirement.

I hope that this lengthy contribution may help clear the way for users to feel free to use superscript and subscript characters the way they prefer.

Marcel

[1] TUS 9.0, §22.4, p. 786:
|
| In general, the Unicode Standard does not attempt to describe the positioning
| of a character above or below the baseline in typographical layout.
| Therefore, the preferred means to encode superscripted letters or digits,
| such as “1st” or “DC0016”, is by style or markup in rich text. […]
| In addition, superscript digits are used to indicate tone in transliteration
| of many languages. The use of superscript two and superscript three is common
| legacy practice when referring to units of area and volume in general texts.
| http://www.unicode.org/versions/Unicode9.0.0/ch22.pdf#G42931

[2] TUS 9.0, §7.8, p. 327:
|
| The superscript forms of the i and n letters can be found in the
| Superscripts and Subscripts block (U+2070..U+209F). The fact that the latter
| two letters contain the word “superscript” in their names instead of “modifier
| letter” is an historical artifact of original sources for the characters, and
| is not intended to convey a functional distinction in the use of these
| characters in the Unicode Standard.
|
| Superscript modifier letters are intended for cases where the letters carry
| a specific meaning, as in phonetic transcription systems, and are not
| a substitute for generic styling mechanisms for superscripting of text,
| as for footnotes, mathematical and chemical expressions, and the like.
| http://www.unicode.org/versions/Unicode9.0.0/ch07.pdf#G24762
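
P.S. For readers who want to try the autocorrect test and the plain-text fraction scheme mentioned in the message outside Word, here is a minimal Python sketch. The four trigger/replacement pairs are the ones listed above; the function names `autocorrect` and `plain_text_fraction` and the simple replace-everywhere behavior are assumptions made for illustration, and do not reproduce how Word's autocorrect decides when to fire.

```python
# Autocorrect entries as listed in the message above.
AUTOCORRECT = {".s.": "ˢᵗ", ".n.": "ⁿᵈ", ".r.": "ʳᵈ", ".t.": "ᵗʰ"}

# Digit tables: superscripts (U+2070, U+00B9, U+00B2, U+00B3, U+2074..U+2079)
# and subscripts (U+2080..U+2089).
SUPERSCRIPT_DIGITS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
FRACTION_SLASH = "\u2044"

def autocorrect(text: str) -> str:
    """Hypothetical helper: replace every trigger with its superscript suffix."""
    for trigger, replacement in AUTOCORRECT.items():
        text = text.replace(trigger, replacement)
    return text

def plain_text_fraction(numerator: int, denominator: int) -> str:
    """Hypothetical helper: superscript numerator, U+2044, subscript denominator."""
    return (
        str(numerator).translate(SUPERSCRIPT_DIGITS)
        + FRACTION_SLASH
        + str(denominator).translate(SUBSCRIPT_DIGITS)
    )

print(autocorrect("December 28.t., 2016"))  # December 28ᵗʰ, 2016
print(plain_text_fraction(3, 16))           # ³⁄₁₆
```

How the output looks still depends on the font, as noted for Cambria and Arial above; the sketch only shows that the result is ordinary plain text.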

