James:
>Uniscribe will provide glyph substitutions for glyphs encoded in the PUA
>under certain conditions. Uniscribe looks for glyphs in a certain range
>based on the script, so if the first glyph ID in the GSUB or GPOS table is
>not within that script's range (i.e., it's in the PUA or not mapped), Uniscribe
>will not look it up.
>
>But if the first glyph ID is in the target script range, it will trigger the
>look-up.
>
>I've recently completed testing GSUB and GPOS for Unicode Bengali.
>
>For the conjunct form K-T-BA, the first step is to make the K-TA with
>U+0995 U+09CD U+09A4. We'll call the resulting glyph U+E001. The next
>step is to make the K-T-BA. U+E001 U+09CD U+09AC are the characters
>needed, but Uniscribe has already performed a re-ordering, so in the
>look-up table this string must appear as U+E001 U+09AC U+09CD. This
>seems to work just fine. Once the first substitution has occurred, the
>resulting substitutions apparently consider that any new Glyph ID
>produced by a previous substitution *is* part of the target script range,
>and further substitutions will work.
It's not entirely clear to me what you are conveying, but I think you are talking in terms of OpenType lookups, which all operate on glyph IDs and not character codes. Uniscribe must first operate on character codes. In the process you're describing (if I have understood it), Uniscribe and the OT layout engine may look at a glyph with an ID of 0xE001 (or perhaps a glyph that the cmap encodes as U+E001, although at this point it is irrelevant whether the glyph is encoded in the cmap at all), but they never look at the character U+E001; at the point you are referring to, the transformation to glyph space has already occurred.
If we start with data containing < 0995, 09CD, 09A4 >, Uniscribe will reorder where needed, do the cmap lookup to get the initial glyph IDs, and then apply certain feature tags. From there it will start processing lookups on the tagged string of glyph IDs. For the situation that was asked about, though, you need to consider a scenario in which you start with data containing something like < E001, 09CD, 09A4 >. In that situation, I believe Uniscribe will not operate on U+E001.
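
To make the character/glyph distinction concrete, here is a rough toy model in Python. It is only a sketch of the idea, not how Uniscribe is actually implemented: the glyph IDs, the CMAP and LIGATURES tables, and the function names are all invented for illustration, reordering is left out entirely, and the run-splitting rule simply encodes my belief stated above.

```python
# Toy model only: every glyph ID and table below is invented for illustration.
BENGALI = range(0x0980, 0x0A00)  # Unicode Bengali block

# Hypothetical cmap: character code -> glyph ID
CMAP = {0x0995: 10, 0x09CD: 11, 0x09A4: 12, 0x09AC: 13, 0xE001: 50}

# Ligature substitutions are keyed by glyph-ID sequences, never character codes.
LIGATURES = {
    (10, 11, 12): 50,  # KA + HASANT + TA -> K-TA
    (50, 11, 13): 51,  # K-TA + HASANT + BA -> K-T-BA
}

def apply_ligatures(glyphs):
    """Repeatedly replace the first matching glyph-ID sequence (simplified)."""
    out = list(glyphs)
    changed = True
    while changed:
        changed = False
        for i in range(len(out)):
            for seq, lig in LIGATURES.items():
                if tuple(out[i:i + len(seq)]) == seq:
                    out[i:i + len(seq)] = [lig]
                    changed = True
                    break
            if changed:
                break
    return out

def shape(text):
    """Decide on character codes which runs get shaped, then work on glyph IDs."""
    result, run = [], []
    for ch in text:
        cp = ord(ch)
        if cp in BENGALI:
            run.append(CMAP[cp])              # part of a Bengali run
        else:
            result.extend(apply_ligatures(run))
            run = []
            result.append(CMAP.get(cp, 0))    # mapped, but no lookups applied
    result.extend(apply_ligatures(run))
    return result

print(shape("\u0995\u09CD\u09A4\u09CD\u09AC"))  # [51]
print(shape("\uE001\u09CD\u09AC"))              # [50, 11, 13]
```

In the first call, glyph 50 appears only as the output of an earlier substitution, so the second ligature rule can match it. In the second call, the same glyph 50 comes from the character U+E001, which in this model falls outside the Bengali run, so no lookup ever sees it.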
- Peter
---------------------------------------------------------------------------
Peter Constable
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>

