On 4/15/2012 10:04 PM, Asmus Freytag wrote:
The 1E00 and 1F00 blocks were populated, in Unicode 1.1, by rejects from Unicode 1.0 that were re-admitted as part of the merger with ISO/IEC 10646. If you have anyone with access to the early (paper only) meeting documents of WG2, you might, just might, find a source for them.

Well, guess what -- I have access to someone with the relevant meeting documents. ;-)

The first key document is:

WG2 N754, Review of repertoire, by Masami Hasegawa, dated September 1991.
(Mark Davis and I assisted Hasegawa-san in pulling together the lists in this
document.)

That document lists *all* of the Latin composite letter collections that Hasegawa-san, then the editor of 10646, had to wrestle with, in order to come up with an acceptable draft for the 2nd DIS, after the failure of the first DIS vote and the determination by WG2 that a merger of repertoires was necessary to construct a DIS that could pass. (A lot of other architectural changes were necessary as well, but right now I'm
focussing on the Latin repertoire issue.)

Section 1.1.2 of WG2 N754 reads:

===============================================================

1.1.2 Latin Composites, Collection #2A

Extra Latin composites, descending from DP1 of 10646. These are derived from a
variety of sources, and are intended to cover a number of languages and
transcriptional systems (e.g. various Indo-European and Semitic transcriptions).

===============================================================

There then follows a long list of composite characters that were in DP1 of 10646.
WG2 N754 then goes on to identify which of those particular characters were
supported by explicit national requirements in the ballot record. The remainder were winnowed down, using a list of exceptions, explicitly spelled out on page 8
of WG2 N754. What was left constituted the bulk of the composite Latin
characters that were eventually included in the 2nd DIS in the range 1E00..1E95,
and which you see there still in the standard.

O.k., so far so good. But you may well ask, what about 1E96..1E9A, which includes
the ẘ character? How did *those* get in?

Well, the pertinent document for that is WG2 N759, "Liaison Statements to JTC1/SC2/WG2 considering the Arabic part of ISO DIS 10646M", from the ECMA (European Computer Manufacturers Association) Arabic Task Group, dated October 1991. The relevant portion of that document is Appendix A (=ECMA ATG N213), "Tranliteration [sic] characters for Arabic characters and Hieroglyphs", authored by Alaa Ghoneim, who at the time was representing Egypt during the WG2 meetings. Alaa Ghoneim cites as sources
ISO 233 Parts 1 and 2 (for Arabic) and the Egyptian Grammar by Gardiner for
Latin transliteration of hieroglyphs.

Part II of that Appendix says: "The following [9] characters do not exist in 10646 and hence need to be added in plane 0", followed by 9 composite transliteration
characters drawn from those two sources, with the source of each character not individually identified.

WG2 N759 was discussed at the Paris meeting of WG2 (October 7-11, 1991). The
minutes from that meeting (WG2 N767) note:

"N759 ECMA Arabic TG Input and N746 Input from Egypt
1) 9 missing characters for transliteration
   ==> review all transliteration characters
..."

Hasegawa-san took that under advisement and determined that 3 of the
transliteration characters in that list of 9 were in fact already in the draft of the DIS.
The remaining six are those which you now see in the range U+1E96..U+1E9A,
including the ẘ in question.

No national body objected to the inclusion of those particular 6 in the voting on 10646 DIS 1.2, so they ended up published in the eventual 10646-1:1993 (and in Unicode 1.1).

And that, folks, is the origin of ẘ.

--Ken
