On 7/26/2010 6:55 AM, John Burger wrote:
Mark Davis ☕ wrote:

From just a quick scan, it appears that they are currently all contiguous within their respective groups. If we were to impose a stability policy, it would be a constraint on the general_category: we would not assign general_category=decimal_number to any character unless it was part of a contiguous range of 10 such characters with ascending values from 0..9.
While that is true for the properties, it's not true for the encoding of characters that are *used* as decimal digits. Martin gave the most widely used counterexample.


Whether or not such a policy makes sense, I'm not clear on why it would be called a "stability" policy - the analogy to the existing such policies seems strained at best.

There are two parts to this.

One, and I think this is the more important part, is to have an encoding policy of not splitting up runs of decimal digits - which would include reserving a spot for a zero, in case, *over the lifetime of Unicode*, some script changes its use from numbers 1-9 to decimal digits.

The other is a guarantee of what it means for a character to have the decimal digit property.

My suggestions for handling these two parts differ a bit from what has been discussed so far.

The first I would address by suitable language in the WG2 Principles and Procedures document. This is where policies on encoding are maintained. True, these policies do allow exceptions, and exceptions (note Han!) do exist; if a similar case of mixed-use characters came along, it would have to be dealt with accordingly. What the P&P would do is remove the wrong notion that it is OK to scatter runs of known decimal digits when encoding new scripts.

The second I would address not by a stability policy, but by clarity of definition of the property. Language such as:

   "A character is given the decimal digit property, if and only if, it is
    used in a decimal place-value notation and all 10 digits are encoded
    in a single unbroken run starting with the digit of value 0, in ascending
    order of magnitude".

or equivalent would be quite sufficient. That language happens to be a much clearer statement of the *implicit* definition used in assigning this property than the language found in UAX#44 or Unicode Section 4.6.
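
The encoding half of that wording (one unbroken run of 10, starting at value 0, ascending) can be checked mechanically. The following is only a rough sketch in Python, using the standard unicodedata module; the function names decimal_digit_runs and run_is_well_formed are made up for illustration, and the result depends on whichever version of the Unicode data the Python build ships with. It cannot test the other half of the wording, namely whether the characters are actually used in a decimal place-value notation. Since several sets of digits may be encoded back to back, the check accepts any run whose length is a multiple of 10 and whose values cycle 0..9.

```python
import sys
import unicodedata


def decimal_digit_runs():
    """Group all characters with General_Category=Nd into runs of adjacent code points."""
    runs = []
    current = []
    for cp in range(sys.maxunicode + 1):
        if unicodedata.category(chr(cp)) == "Nd":
            current.append(cp)
        elif current:
            # A non-Nd code point ends the current run.
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


def run_is_well_formed(run):
    """Check that a run consists of complete sets of 10 digits with values 0..9 in order.

    `run` is a list of adjacent code points. Characters without a decimal
    value get the default -1 and therefore fail the comparison.
    """
    if not run or len(run) % 10 != 0:
        return False
    return all(
        unicodedata.decimal(chr(cp), -1) == i % 10
        for i, cp in enumerate(run)
    )


if __name__ == "__main__":
    # Report any run of Nd characters that does not match the proposed wording.
    for run in decimal_digit_runs():
        if not run_is_well_formed(run):
            print("irregular run: U+%04X..U+%04X" % (run[0], run[-1]))
```

Characters that are merely *used* as digits but carry a different general category, such as the Han numerals, never show up in this check at all, which is the distinction between property and usage made above.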

Having that language where the property is documented is much more useful and visible than in a stability policy.

A./
