On 5 March 2012 at 19:35, Denis Jacquerye <[email protected]> wrote:
> On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy <[email protected]> wrote:
>> I am looking for the codes or assignment status of the Cyrillic
>> letter OE/oe (ligatured) as used in Selkup (exactly similar to the
>> Latin pair).
>>
>> This character pair has been part of the registration nr. 223 (in
>> 1998) by ISO of the (8-bit) "extended Cyrillic character set for
>> non-Slavic languages for bibliographic information interchange":
>>
>> http://www.itscj.ipsj.or.jp/sc2/open/02n3136.pdf
>>
>> According to this document, this character set had also been
>> standardized as ISO 10756:1996. Note that it contains many other
>> characters for which it did not document any mapping to the UCS in the
>> then emerging ISO 10646 standard.
>>
>> It has even been part of proposals at the UTC and ISO the same year
>> for inclusion in the UCS, along with other characters (at that time,
>> Michael Everson wrote a proposal placing them at U+04EC, U+04ED, but
>> since then, those slots have been used for other characters; that
>> block is now full).
>>
>> It is also referenced in the ISO 9 Cyrillic/Latin transliteration standard.
>>
>> Still, there's no such Cyrillic character I can find in the encoded UCS
>> in other Cyrillic extended blocks that are not full (for example, the
>> CYRILLIC SUPPLEMENT block at U+0500-052F).
>>
>> Where are those characters? And what about the remaining characters
>> found in the Registration nr. 223 and ISO 10756:1996? And their
>> status in the ISO 9 standard itself?
>>
>> Thanks.
>>
>> -- Philippe.
>>
>
> According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the
> Cyrillic Selkup OE is mapped to Latin OE:
> CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE
> CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE OE
> Several other of those missing Cyrillic characters are simply mapped
> to Latin ones or sort of decomposed.

Apparently this document is obsolete. Some of the characters it proposed
mapping to Latin have since been encoded as plain Cyrillic letters, for
example CYRILLIC SMALL LETTER KURDISH QA (rather than the initially
proposed mapping to LATIN SMALL LETTER Q).

The document was still a draft, not a decision. It specifically says:
"The issue with these letters is whether they should be deunified from
Latin, and encoded in the Cyrillic block".
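
For anyone who wants to check the current state of the code points
discussed in this thread, here is a minimal Python sketch. It only reads
the Unicode Character Database bundled with the interpreter, so the
results depend on the Unicode version your Python ships with. The helper
names (describe, cyrillic_names_containing) and the list of Cyrillic
block ranges are my own choices for illustration, not something taken
from the documents cited above.

```python
import unicodedata

# Code points mentioned in this thread: the Latin OE ligatures, the two
# slots once proposed for the Selkup letters, and the Cyrillic QA pair.
CODE_POINTS = [0x0152, 0x0153, 0x04EC, 0x04ED, 0x051A, 0x051B]

# Assumed list of Cyrillic blocks to scan (Cyrillic, Cyrillic Supplement,
# Extended-C, Extended-A, Extended-B); adjust as needed.
CYRILLIC_RANGES = [
    (0x0400, 0x052F),
    (0x1C80, 0x1C8F),
    (0x2DE0, 0x2DFF),
    (0xA640, 0xA69F),
]


def describe(cp):
    """Return 'U+XXXX NAME', or 'U+XXXX <unassigned>' if there is no name."""
    name = unicodedata.name(chr(cp), "<unassigned>")
    return f"U+{cp:04X} {name}"


def cyrillic_names_containing(fragment):
    """List named characters in CYRILLIC_RANGES whose name contains fragment."""
    hits = []
    for start, end in CYRILLIC_RANGES:
        for cp in range(start, end + 1):
            name = unicodedata.name(chr(cp), "")
            if fragment in name:
                hits.append(describe(cp))
    return hits


if __name__ == "__main__":
    for cp in CODE_POINTS:
        print(describe(cp))
    # Look for any Cyrillic character whose name mentions Selkup.
    print(cyrillic_names_containing("SELKUP"))
```

Running it prints the assigned name for each code point and then any
Cyrillic character whose name mentions Selkup, which gives a quick way to
see whether a dedicated Cyrillic OE has been encoded in the Unicode
version at hand.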

