Richard Wordingham asked:

> Is the provisional property 'Indic_Syllabic_Category' defined by
> anything deeper than the UCD file IndicSyllabicCategory itself?

Basically, no. It simply gathers together information scattered about in the core spec and elsewhere about claims regarding what all the characters are.

The classification has been undergoing further review and will be updated again shortly for the 7.0 Unicode release with some further distinctions and corrections. However, the file(s) (and properties) will remain provisional for Unicode 7.0. And there is no overarching UTR which provides a definitive model for all of these categories. The values are evolving more along the lines of what is proving useful for implementation, rather than being a priori defined categories.

> Is the property meant to be tailorable? For example, there are
> encoded characters in the Khmer script that serve as tone marks when it
> is used to write Thai.

For a property to be "tailorable" in a Unicode context, you pretty much have to have some kind of algorithm defined which uses those property values and then changes them in some systematic way to modify the outcome of the algorithm. In this case, there is no Unicode algorithm defined (although implementers may have specific algorithms in their rendering engines), and the data is all provisional.

There is a probability that the two Indic category files may be promoted to *informative* status as of Unicode 8.0, with further modifications, extensions, and corrections. The main difference would be that once a property becomes *informative* in the UCD, the UTC would be committed to keeping it around and maintaining it. By contrast, a provisional property can just be removed, if it doesn't pan out.

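To make the tailoring point concrete, here is a minimal Python sketch of what an implementer might do in the absence of any Unicode-defined algorithm: read lines in the usual UCD layout (code point or range, semicolon, value, optional `#` comment) and consult an implementation-level override table before falling back to the file's value. Everything here is an assumption for illustration. The sample lines are written in the UCD style and are not an authoritative copy of IndicSyllabicCategory; the function names, the "Other" default, and the override of U+17C9 to Tone_Mark are hypothetical (the question above does not say which Khmer characters are meant).

```python
from typing import Dict, Iterator, Optional, Tuple

# Illustrative lines in UCD style; NOT an authoritative copy of the data file.
SAMPLE = """\
# code point(s) ; property value # comment
0915..0939    ; Consonant
094D          ; Virama
17C9..17CA    ; Register_Shifter # sample Khmer entry
"""


def parse_ucd_lines(text: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (first, last, value) for each data line, skipping comments."""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue
        cp_field, value = (field.strip() for field in line.split(";", 1))
        first, _, last = cp_field.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        yield start, end, value


def build_table(text: str) -> Dict[int, str]:
    """Expand ranges into a simple code point -> value mapping."""
    table: Dict[int, str] = {}
    for start, end, value in parse_ucd_lines(text):
        for cp in range(start, end + 1):
            table[cp] = value
    return table


def category(cp: int, table: Dict[int, str],
             overrides: Optional[Dict[int, str]] = None) -> str:
    """Look up a value, letting implementation overrides win.

    "Other" as the fallback for unlisted code points is an assumption
    of this sketch.
    """
    if overrides and cp in overrides:
        return overrides[cp]
    return table.get(cp, "Other")


table = build_table(SAMPLE)
# Hypothetical override, chosen only to show the mechanics.
thai_in_khmer_overrides = {0x17C9: "Tone_Mark"}
```

A renderer built this way would call `category(cp, table)` by default and pass `thai_in_khmer_overrides` only for text it knows needs the alternative behavior. That is a private implementation choice, not tailoring in the Unicode sense, since no Unicode algorithm consumes these values.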
My suggestion, for those who are interested in this topic, would be to review the relevant data files, implied script behaviors, and documents and proposals in the UTC document register -- and over the course of the next year participate in providing feedback on this topic and the data files. That way, if/when the files and related properties become informative for Unicode 8.0 next year sometime, these questions and any concerns about the various edge cases as applied to Southeast Asian scripts can be addressed before the properties become more difficult to update.

--Ken

