Hi.

> From: Mark Davis ☕ ([email protected])
> Date: Mon Jul 26 2010 - 14:13:22 CDT

> I agree that having it stated at point of use is useful - and we do that in
> other cases covered by stability clauses; but we can only state it IF we
> have the corresponding stability policy.
> Mark
> . . .
>> On Mon, Jul 26, 2010 at 11:06, Asmus Freytag <[email protected]> wrote:
>>> On 7/26/2010 6:55 AM, John Burger wrote:
>
>>> Mark Davis ☕ wrote:
>>
>>>> From just a quick scan, it appears that they are currently all contiguous
>>>> within their respective groups. If we were to impose a stability policy,
>>>> it would be a constraint on the general_category: we would not assign
>>>> general_category=decimal_number to any character unless it was part of a
>>>> contiguous range of 10 such characters with ascending values from 0..9.
>>>
>> While that is true for the properties, it's not true for the encoding of
>> characters that are *used* as decimal digits. Martin gave the most widely
>> used counterexample.
>>
>>>
>>> Whether such a policy makes sense, I'm not clear on why it would be called
>>> a "stability" policy - the analogy to the existing such policies seems
>>> strained at best.
>>>
>> There are two parts to this.
>>
>> One, and I think this is the more important part, is to have an encoding
>> policy of not splitting up runs of decimal digits - which would include
>> reserving a spot for a zero, in case, *over the lifetime of Unicode*, some
>> script changes its use from numbers 1-9 to decimal digits.
>>
>> The other is a guarantee of what it means for a character to have the
>> decimal digit property.
>>
>> My suggestions for handling this differ a bit from what has been discussed
>> so far.
>>
>> The first I would address by suitable language in the WG2 Principles and
>> Procedures document. This is where policies on encoding are maintained.
>> True, these policies do allow exceptions, but exceptions (note Han!) do
>> exist, and if a similar case of mixed-use character came along, then they
>> would have to be dealt with accordingly. What the P&P would do is remove the
>> wrong notion that it is OK to scatter runs of known decimal digits when
>> encoding new scripts.
>>
>> The second I would address not by a stability policy, but by clarity of
>> definition of the property. Language such as:
>>
>> "A character is given the decimal digit property, if and only if, it is
>> used in a decimal place-value notation and all 10 digits are encoded
>> in a single unbroken run starting with the digit of value 0, in
>> ascending order of magnitude".
>>
>> or equivalent would be quite sufficient. That language happens to be a much
>> clearer statement of the *implicit* definition used in assigning this
>> property than the language found in UAX#44 or Unicode Section 4.6.
>>
>> Having that language where the property is documented is much more useful
>> and visible than in a stability policy.
>>
>> A./

I like this policy -- both parts of it -- but agree with Asmus that the
first thing to do is define a decimal digit; that will rule out characters
such as those Asmus has described, where

> the same [alphabetic] characters
> are also used as elements in a system that doesn't use place-value, but
> uses special characters to show powers of 10.

(There is no reason for these not to be as contiguous as possible, but they
cannot be contiguous if they are alphabetic . . . and if there is no zero,
then reserving a space for the zero is a moot issue. Also, these are all
already encoded, and I think we want the policy for future encodings only.)

There are other cases where characters do not use place value although they
seem to be based on 10's, 100's, etc. A number of languages used | for 1,
|| for 2, ||| for 3, or something similar, and then had bundled multiples of
10. Most of these seem to be ancient languages, and certainly there is no 0
and no need to reserve space for it.

I've not gone through many character charts, though, so I can't really speak
as an expert as you all can. Sorry I've not gotten to more; I will try to (I
have been looking some at my registries instead; long story).

Best,

C. E. Whitehead
[email protected]
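
The encoding half of the definition Asmus proposes (all 10 digits in a single unbroken run, starting with the digit of value 0, in ascending order) can be checked mechanically. What follows is only a minimal sketch, not anything proposed in this thread: it assumes Python's standard `unicodedata` module, whose data reflects whatever Unicode version ships with the interpreter, and the helper names `decimal_digit_runs` and `is_well_formed` are made up for illustration. It cannot test the other half of the definition, namely whether a character is actually *used* in a decimal place-value notation.

```python
import sys
import unicodedata


def decimal_digit_runs():
    """Collect maximal runs of adjacent code points with General_Category Nd."""
    runs, current = [], []
    for cp in range(sys.maxunicode + 1):
        if unicodedata.category(chr(cp)) != "Nd":
            continue
        # A gap in code points closes the current run and starts a new one.
        if current and cp != current[-1] + 1:
            runs.append(current)
            current = []
        current.append(cp)
    if current:
        runs.append(current)
    return runs


def is_well_formed(run):
    """True if the run is made of blocks of 10 code points valued 0..9 in order.

    The caller is assumed to pass adjacent code points, as produced by
    decimal_digit_runs(); adjacency itself is not re-checked here.
    """
    if not run or len(run) % 10 != 0:
        return False
    # decimal() returns the default (-1) for characters without a decimal value.
    values = [unicodedata.decimal(chr(cp), -1) for cp in run]
    return all(values[i:i + 10] == list(range(10)) for i in range(0, len(values), 10))


if __name__ == "__main__":
    for run in decimal_digit_runs():
        if not is_well_formed(run):
            print(f"U+{run[0]:04X}..U+{run[-1]:04X} breaks the rule")
```

Runs longer than 10 are accepted as long as they split into consecutive 0..9 blocks, since several digit sets can sit back to back in the code charts. Any run the script reports would be a counterexample to the proposed wording in the Unicode version being examined.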

