Joan Wardell wrote:

> Ken: Speaking for Sybase products, "fixing" the combining classes of the
> existing vowels would have *no* positive impacts. It would have
> a large number of negative impacts, the ultimate ramifications
> of which I cannot even follow to their eventual conclusions. ...
>
> I hope you will excuse my ignorance, but I do not understand how correcting
> the canonical classes is such a huge technical problem. If anyone has
> already normalized their biblical Hebrew data, they have trashed it, and it
> will have to be re-done anyway.

That is beside the point for the implementations I'm talking about, actually.

> Secondly, the Character Properties would
> appear to be one huge matrix which would be called by any software needing
> to know these.

It isn't.

> Why can't we just fix the database? :)

Because changing the canonical ordering classes (in ways not allowed by the stability policies) breaks the normalization *algorithm* and the expected test results it is tested against.

> I am completely ignorant of the mechanics of sorting algorithms and
> whatever types of software are required to implement canonical classes.
> However I can tell you it is no small thing to write in some kind of
> intelligence in every future keyboard, conversion table, and search engine
> for Hebrew just to identify how to undo "Yerushali-am". And having to trick
> every browser is no small feat either. And that is only one exception of
> the many that have been discussed.

The CGJ proposal doesn't involve tricking out anybody's display, if done correctly. And I'm not talking about "undoing" normalized i-a sequences that have the incorrect order. As you noted above, any such data is trashed already and will have to be "re-done anyway".

I don't see where conversion tables are at issue here. No character mapping for Unicode is involved or changed by this. Unless you are talking about conversion algorithms for batch conversion of existing Biblical Hebrew repositories into Unicode -- but those are specialized code to begin with, and it is much less impact to ask people to update the tables in those to insert a CGJ into the point sequences than it is to ask all implementers to deal with the consequences of broken normalization.

And I don't think you have thought through the consequences, for Biblical Hebrew itself, of having inconsistent normalization implementations (pre-fix and post-fix) floating around.
Those will impact precisely the data you are trying to fix here, in ways that will force precisely the kinds of fixes in applications and search engines that you are worried about avoiding.

--Ken
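
For readers who want to see the reordering and the effect of a CGJ concretely, here is a minimal sketch using Python's standard unicodedata module. It is only an illustration, not taken from any Sybase implementation. The choice of lamed + patah + hiriq as the sample sequence (the "a" then "i" points discussed above) and the helper name codepoints are assumptions made for the example. The combining class values in the comments (patah 17, hiriq 14, CGJ 0) are the ones in the Unicode Character Database.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED, a base letter (combining class 0)
PATAH = "\u05B7"  # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, combining class 14
CGJ = "\u034F"    # COMBINING GRAPHEME JOINER, combining class 0


def codepoints(text):
    """Hypothetical helper: render a string as a list of U+XXXX labels."""
    return ["U+%04X" % ord(ch) for ch in text]


# Intended order on the lamed: patah first, then hiriq.
plain = LAMED + PATAH + HIRIQ
# The same sequence with a CGJ inserted between the two points.
with_cgj = LAMED + PATAH + CGJ + HIRIQ

nfc_plain = unicodedata.normalize("NFC", plain)
nfc_with_cgj = unicodedata.normalize("NFC", with_cgj)

print("plain    ->", codepoints(nfc_plain))
print("with CGJ ->", codepoints(nfc_with_cgj))
print("plain unchanged:", nfc_plain == plain)
print("with CGJ unchanged:", nfc_with_cgj == with_cgj)
```

Under the existing combining classes, the plain sequence comes back with the hiriq moved ahead of the patah, because canonical ordering sorts adjacent marks by class. The CGJ version comes back unchanged, since a character of class 0 stops the reordering from crossing it. Any conformant normalizer gives the same answers for both strings, which is exactly what would be lost if the classes were changed in some implementations and not others.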