2011/9/14 Kent Karlsson <[email protected]>:
> Because that stability guarantee says "The Bidi_Class property values will
> not be further subdivided." I'm not too keen on the word "subdivided" here,
> but it (here) means there will be *no additions* to the set of values for
> the Bidi_class property. Not even for new characters.
>
> As far as I can tell, there is no restriction saying that the bidi algorithm
> cannot look at code points as well as bidi category values.
That's absolutely not how I understand it, notably if you consider the word "further": it refers to what was done before the policy was adopted, when subsets of characters listed in the same class were later partitioned into separate classes. I am not advocating changing the bidi classes already assigned to characters, only pointing out that currently unassigned characters already fall outside every one of these classes.

And how will you define what an "implicit" LDM is? For example, "1.2" has two interpretations: a single number with a decimal separator, where the fields of digits have a fixed relative order and the dot is part of the number; or a notation for two distinct numbers in fields separated by the dot, the fields being assumed to be displayed in the same direction as the embedding paragraph. The same goes for "31/12": is it a date made of two fields to render in the embedding direction, or a fraction whose operands must NOT commute?

As this is impossible to determine, I really think that, in the absence of markup, the existing CS class between two numbers should ALWAYS resolve using the bidi class of these numbers (this means that "1.2" would always be considered a single number, and "31/12" would always be a fraction).

To change this meaning (and the expected rendering order if the embedding paragraph is RTL), there's only one way: isolate the numbers in LRE..PDF, which prevents the propagation of their strong directionality to the separating character of class CS. The whole sequence "LRE(number)PDF" then externally has a weak direction, just like the surrounding CS characters, whitespace, and other punctuation, and all these runs will need to take their direction from the embedding paragraph.
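
To make the rule argued for above concrete, here is a minimal Python sketch, not a bidi implementation. It only looks at the raw Bidi_Class of each character (via the standard unicodedata module) and lets a single CS between two numbers of the same class take that class. The function name resolve_separators is my own, and the sketch deliberately ignores embedding levels and every other step of the algorithm.

```python
import unicodedata


def resolve_separators(text):
    """Return the Bidi_Class of each character, with a lone CS between
    two numbers of the same class (EN or AN) taking that class.

    Simplified illustration only: no embedding levels, no other rules.
    """
    classes = [unicodedata.bidirectional(ch) for ch in text]
    resolved = list(classes)
    for i in range(1, len(classes) - 1):
        before, after = classes[i - 1], classes[i + 1]
        if classes[i] == "CS" and before == after and before in ("EN", "AN"):
            resolved[i] = before
    return resolved


print(resolve_separators("1.2"))    # the dot is absorbed into the number
print(resolve_separators("31/12"))  # so is the solidus
print(resolve_separators("\u202a1\u202c.\u202a2\u202c"))  # LRE 1 PDF . LRE 2 PDF
```

With each number wrapped in LRE..PDF (U+202A, U+202C), the separator is no longer adjacent to a digit in this sketch, so it stays CS and is left to take its direction from the paragraph. The real algorithm handles embeddings through level runs rather than simple adjacency.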

