Re: Origins of ẘ

Ken Whistler Mon, 16 Apr 2012 12:15:15 -0700

On 4/15/2012 10:04 PM, Asmus Freytag wrote:

The 1E00 and 1F00 blocks were populated, in Unicode 1.1, by rejects from Unicode 1.0 that were re-admitted as part of the merger with ISO/IEC 10646. If you have anyone with access to the early (paper only) meeting documents of WG2, you might, just might, find a source for them.

Well, guess what -- I have access to someone with the relevant meeting documents. ;-)


The first key document is:

WG2 N754, Review of repertoire, by Masami Hasegawa, dated September 1991.

(Mark Davis and I assisted Hasegawa-san in pulling together the lists in this document.)

That document lists *all* of the Latin composite letter collections that Hasegawa-san, then the editor of 10646, had to wrestle with, in order to come up with an acceptable draft for the 2nd DIS, after the failure of the first DIS vote and the determination by WG2 that a merger of repertoires was necessary to construct a DIS that could pass. (A lot of other architectural changes were necessary as well, but right now I'm focussing on the Latin repertoire issue.)

Section 1.1.2 of WG2 N754 reads:

===============================================================

1.1.2 Latin Composites, Collection #2A

Extra Latin composites, descending from DP1 of 10646. These are derived from a variety of sources, and are intended to cover a number of languages and transcriptional systems (e.g. various Indo-European and Semitic transcriptions).


===============================================================

There then follows a long list of composite characters that were in DP1 of 10646.

WG2 N754 then goes on to identify which of those particular characters were supported by explicit national requirements in the ballot record. The remainder were winnowed down, using a list of exceptions, explicitly spelled out on page 8 of WG2 N754. What was left constituted the bulk of the composite Latin characters that were eventually included in the 2nd DIS in the range 1E00..1E95, and which you see there still in the standard.

O.k., so far so good. But you may well ask, what about 1E96..1E9A, which includes the ẘ character? How did *those* get in?

Well, the pertinent document for that is WG2 N759, "Liaison Statements to JTC1/SC2/WG2 considering the Arabic part of ISO DIS 10646M", from the ECMA (European Computer Manufacturers Association) Arabic Task Group, dated October 1991. The relevant portion of that document is Appendix A (=ECMA ATG N213), "Tranliteration [sic] characters for Arabic characters and Hieroglyphs", authored by Alaa Ghoneim, who at the time was representing Egypt during the WG2 meetings. Alaa Ghoneim cites as sources ISO 233 Parts 1 and 2 (for Arabic) and the Egyptian Grammar by Gardiner for Latin transliteration of hieroglyphs.

Part II of that Appendix says: "The following [9] characters do not exist in 10646 and hence need to be added in plane 0", followed by 9 composite transliteration characters from one of those two sources -- not individually identified.

WG2 N759 was discussed at the Paris meeting of WG2 (October 7-11, 1991). The minutes from that meeting (WG2 N767) note:

"N759 ECMA Arabic TG Input and N746 Input from Egypt
1) 9 missing characters for transliteration
   ==> review all transliteration characters
..."

Hasegawa-san took that under advisement and determined that 3 of the transliteration characters in that list of 9 were in fact already in the draft of the DIS.

The remaining six are those which you now see in the range U+1E96..U+1E9A,
including the ẘ in question.

No national body objected to the inclusion of those particular 6 in the voting on 10646 DIS 1.2, so they ended up published in the eventual 10646-1:1993 (and in Unicode 1.1).


And that, folks, is the origin of ẘ.

--Ken
