[Doing a little cut and pasting here to coalesce the context...]

> Peter Kirk wrote,
> >
> > I note an incorrect glyph for U+0185 in Code2000 and in Arial Unicode
> > MS; this looks like b with no serif at the bottom but should be much
> > shorter, like ь, the Cyrillic soft sign.
>
James Kass responded:

> ... With regards to U+0185, could it be
> said that the informative glyph in TUS 2.0, 3.0 and 4.0 is a bit
> misleading, or does that glyph represent a variance from the
> text(s) with which you're familiar?
>
> This page uses a scan from THE LANGUAGES OF THE WORLD
> as its Chuang example:
> http://www.worldlanguage.com/Languages/Chuang.htm
>
> No sample text, no lower case illustration:
> http://www.alphabets-world.com/chuang.html
>
> If the informative glyph in TUS *is* misleading, I'll be happy
> to make appropriate changes here.

Peter Kirk responded:

> Yes, you are right, and using a very British hyperbole [recte: litotes].
> The TUS 4.0
> glyph is quite simply incorrect. That is, it is incorrect for the
> Azerbaijani, Khakass and Nogai letter, and it does not make a proper
> distinction from the otherwise almost identical "b". The glyph should
> have the same height as most lower case letters. ... That is, shorter
> than the reference glyph in TUS 4.0. This reference
> glyph needs to be changed. I would suggest a form identical to U+044C.

Before we go charging off to fix all the fonts, we first need to
have clarity regarding which characters are intended for what
here.

Michael Everson has asserted that U+0184/U+0185 *are* the
intended characters for the Pan-Turkic Latin alphabetic use
of the Cyrillic soft sign letter. This is at odds with the
history of the Unicode Standard and with Michael's own
prior assertion in:

http://www.evertype.com/standards/iso10646/pdf/turkmen.pdf

"Latin <soft sign> [is] not encoded in the UCS, complicating
things like monolingual multiscript ordering since the current
UCS expects Cyrillic <soft sign> to do double duty." [2000-06-02]

That earlier statement by Michael correctly reflects the intent of
the standard, I believe. It also correctly reflects Michael's
observation earlier today:

> In Pan-Turkic, though, it looks just like CYRILLIC SOFT SIGN in all
> the sources I have seen. For lots of languages.

And the Unicode solution for that, to date, has been that since
it "looks just like" the CYRILLIC SOFT SIGN in all the sources,
by gum, it *is* the CYRILLIC SOFT SIGN.

[Now don't pile on all at once regarding mixed scripts for
alphabets and rehearsing for the umpteenth time the arguments
about Kurdish Q/W. We've heard all that, and there are abiding
philosophical differences in the committees regarding when
letters borrowed from one script into another become nativized
into that script and require separate encoding. That is all
for another thread. What I am telling you all here is what the
*intent* of the standard has been regarding this *particular*
pair of letters, since 1991.]

The upshot of that is that the glyphs for U+0184/U+0185 are
not to be determined by Azeri/Khakass/Nogai typography, but
by Zhuang typography, for which they were encoded. The glyphs
for U+042C/U+044C are correct for representing the soft sign
in the Pan-Turkic alphabet because, well, they *are* the
soft sign.

Now, let's review the intent for Zhuang (aka Chuang) orthography.

Based on sources such as Katzner (cited in this thread and available
on the web) and Nakanishi, the 5 Zhuang tone letters were
encoded in Unicode as:

Tone 2: U+01A7/U+01A8 (reversed s)
Tone 3: U+0417/U+0437 (Cyrillic ze)
Tone 4: U+0427/U+0447 (Cyrillic che)
Tone 5: U+01BC/U+01BD (roughly 5-shaped letter)
Tone 6: U+0184/U+0185 (similar to soft-sign, but not identical)

Everyone recognizes that the tone letters were mnemonically
based on 2, 3, 4, 5, 6, as well, but there was no point in
actually *using* the digits, as the tone letters are actually
shaped differently and their usage would interfere with the
use of normal digits in Zhuang text.

The Unicode shapes and tone letter identities for Zhuang are
roughly consonant with those also shown at:

http://www.alphabets-world.com/chuang.html

except that the glyph for Tone 4 there is much less che-like
in shape, but still not actually a "4".
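
For anyone who wants to check the tone letter assignments listed above against the character database, here is a minimal sketch in Python using the standard unicodedata module. The names ZHUANG_TONE_LETTERS, describe and tone_table are made up for this illustration and come from no standard or library; the code points are just the ones listed in this message.

```python
import unicodedata

# Zhuang tone letters as listed in this message: tone -> (capital, small).
# The dictionary name and layout are illustrative assumptions only.
ZHUANG_TONE_LETTERS = {
    2: ("\u01A7", "\u01A8"),
    3: ("\u0417", "\u0437"),
    4: ("\u0427", "\u0447"),
    5: ("\u01BC", "\u01BD"),
    6: ("\u0184", "\u0185"),
}


def describe(ch):
    """Return 'U+XXXX CHARACTER NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def tone_table():
    """Map each tone number to descriptions of its capital and small letters."""
    return {
        tone: (describe(cap), describe(small))
        for tone, (cap, small) in ZHUANG_TONE_LETTERS.items()
    }


if __name__ == "__main__":
    for tone, (cap, small) in tone_table().items():
        print(f"Tone {tone}: {cap} / {small}")
    # The tone 6 letter and the Cyrillic soft sign are separate characters.
    print(describe("\u0185"), "vs.", describe("\u044C"))
```

Running it shows tones 3 and 4 resolving to Cyrillic names and tones 2, 5 and 6 to Latin TONE names, which is the split described above.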

Running text citations, as in Katzner, clearly show Cyrillic
ze and che in use for those tones. The debatable edge case
was always for tone 6, where you could argue that the Zhuang
citations were merely an "off" shape for a Cyrillic soft sign
that happened to be used in the text. But as for tones 2
and 5, the more conservative approach taken at the time, in
1990, was to simply identify Zhuang tone 6 as a distinct form,
not identical to the soft sign, and so it was separately
encoded at U+0184/U+0185.

Note that there are more modern representations of Zhuang
that dispense with the special tone letters altogether and
substitute ordinary Latin letters, in a Pinyin-like
simplification. See:

http://www.liuzhou.co.uk/liuzhou/language.htm

with a sign showing the substitution of Latin J, H, Z, X, W(?)
for the 5 Zhuang tone letters. This may reflect an official
attempt to establish a new Latin orthography for Zhuang. See:

http://www.infomekong.com/zhuang_secondary.htm

"The language was not written down until the government made
an attempt in the early 1950's, but they chose to use a
Russian script [sic] and it was never accepted by the people.
A new Latin script was devised in 1986 and the government
through the Minorities Language Commission has encouraged
Zhuang to learn this."

For more background on the political context of Zhuang
orthography development, see:

http://brj.asu.edu/v2512/articles/art8.html

In particular, the about-face by the central government
regarding minority community policies in the late 50's
impacted the history of the Zhuang orthography's use:

"In the middle 1960s, the new Zhuang, Lisu, and Lahu written
languages were withdrawn from the few schools where they had
survived the promotion of Chinese in the late 1950s."

I presume that the 1986 orthography is what is shown in the
Liuzhou sign noted above.

So in any case we may be talking about the encoding of the
tone letters for an attempt at establishing a
Latin/Cyrillic hybrid orthography that failed in the late
1950's and early 1960's in China. It is unclear to me whether
the revival of the use of written Zhuang in the 1980's is
based on the original Zhuang forms or a revision of them
without the Cyrillic-based additions and tone letters.

Perhaps someone on the list who knows more about the actual
history of orthographic reform in the Zhuang Autonomous Region
of Guangxi could chime in with more details.

--Ken

