On Sun, 17 Feb 2013 10:12:26 -0800 Asmus Freytag <[email protected]> wrote:
> On 2/17/2013 8:20 AM, Richard Wordingham wrote:
> > Is there any guarantee that U+E4567 will not have a
> > canonical decomposition mapping to <U+0F73 TIBETAN VOWEL SIGN II,
> > U+E4568>? If so, where is it published? I thought we had guarantees
> > that new canonical decompositions to non-starters would not be
> > created (to <U+0F71, U+0F72, U+E4568> in this case), but I cannot
> > find it. This conceivable decomposition mapping appears to wriggle
> > through a loophole because U+0F73 is a starter, i.e. has canonical
> > combining class 0.

> Let me see whether I follow that.

> If you encode a new character, it can have a decomposition only if that
> decomposition also contains at least one new character. (Remember,
> all decompositions are defined to be pairs, except when they are
> singletons. If a one-to-many mapping is desired, enough intermediate,
> partially composed characters must exist to allow this longer mapping
> to be represented as a chain of simpler mappings.)

Neither U+E4567 nor U+E4568 has yet been assigned, so that does not
preclude the decomposition. If the new character is the first one, it
has to be a starter -
http://www.unicode.org/policies/stability_policy.html 'Property Value
Stability', Version 2.1.0+ - and so does the first value in its
canonical decomposition mapping. The only way for a new canonical
decomposition to start with a non-starter is for the first element of
an intermediate decomposition mapping to be one of U+0F73, U+0F75 and
U+0F81.

> Now, does it make a difference whether that required new character in
> the decomposition is the first or the second? And if it does,
> can one point to a stability guarantee where that is expressed?

> Is that what you are asking?

No. I am trying to confirm that there will never be any character but
U+0344, U+0F73, U+0F75 and U+0F81 that has a non-singleton canonical
decomposition to non-starters.
The only way I can see for that to happen is a decomposition via one of
U+0F73, U+0F75 and U+0F81, such as from U+E4567 to <U+0F73, U+E4568>,
and I cannot see where this is prohibited.

Richard.
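
For anyone who wants to check the current state of the data, here is a
rough Python sketch that scans for characters whose non-singleton
canonical decomposition mapping begins with a non-starter. It relies on
the standard `unicodedata` module, so it reflects whichever Unicode
version that Python build ships; the helper names `canonical_mapping`
and `nonstarter_decompositions` are mine, purely for illustration. It
only inspects existing assignments and of course says nothing about
what future versions are permitted to add.

```python
import sys
import unicodedata


def canonical_mapping(ch):
    """Return the canonical decomposition mapping of ch as code points.

    Returns an empty list if ch has no mapping or only a compatibility
    mapping (those are tagged, e.g. '<compat> 0020 0301').
    """
    fields = unicodedata.decomposition(ch).split()
    if not fields or fields[0].startswith("<"):
        return []
    return [int(field, 16) for field in fields]


def nonstarter_decompositions():
    """Map each code point to its mapping when that mapping is not a
    singleton and its first element has a non-zero combining class."""
    found = {}
    for cp in range(sys.maxunicode + 1):
        mapping = canonical_mapping(chr(cp))
        if len(mapping) >= 2 and unicodedata.combining(chr(mapping[0])) != 0:
            found[cp] = mapping
    return found


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for cp, mapping in sorted(nonstarter_decompositions().items()):
        ccc = unicodedata.combining(chr(cp))
        target = " ".join("U+%04X" % m for m in mapping)
        print("U+%04X (ccc=%d) -> <%s>" % (cp, ccc, target))
```

I would expect the output to include U+0344, U+0F73, U+0F75 and U+0F81,
with the three Tibetan vowel signs showing ccc=0 even though their
mappings start with U+0F71.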

