Kenneth Whistler wrote on 06/26/2003 05:36:34 PM:

> Why is making use of the existing behavior of existing characters
> a "groanable kludge", if it has the desired effect and makes
> the required distinctions in text?

Why is it a kludge to insert some cc=0 control character into the text for the sole purpose of preventing reordering during canonical ordering of two combining marks that do interact typographically and so should but nevertheless do not have the same combining class; and, moreover, to do so using a control character that was not created for that purpose? The answer seems so obvious, I wouldn't know how to begin responding. And the fact that it achieves some desired effect has no bearing on being described as a kludge -- every kludge achieves some desired effect. If it were otherwise, the given practice would never have been conceived.

> But in the 10646 WG2 context, coming in with a duplicate set
> of Hebrew points is not going to make any sense...
> You can always come in
> with the proposal to encode BIBLICAL HEBREW POINT PATAH and
> say, even though the glyph is identical, see, the name is
> different, so the character is different. But this is a pretty
> thin disguise, and is vulnerable to simple questioning:
> What is it for?

Are we saying that ISO doesn't give a rip for implementation issues? Or that their notion of ordering distinctions is different from Unicode's such that *any* differently ordering permutation of some given set of characters is considered a distinct representation? Are we saying that the voting members of WG2 are not already aware of the issue that has been discussed and incapable of understanding an explanation of these issues addressed to them?

> I'm trying to find a way, using existing characters and a
> simple set of text representational conventions, to make
> the distinctions and preserve the order relations that you
> need for decent font lookup, without the whole enterprise
> washing up on either of those two rocks.

Understood.
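
For concreteness, the reordering at issue can be sketched as follows. This is only an illustration, assuming Python's standard unicodedata module; the sample sequence (lamed carrying patah then hiriq) and the helper name `combining_classes` are assumptions chosen for the example, not taken from the proposal. RLM is used as the cc=0 blocker only because it is the control suggested in this thread.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED, combining class 0
PATAH = "\u05B7"  # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, combining class 14
RLM = "\u200F"    # RIGHT-TO-LEFT MARK, combining class 0


def combining_classes(text):
    """Return the canonical combining class of each character (hypothetical helper)."""
    return [unicodedata.combining(ch) for ch in text]


# Intended order: patah first, then hiriq.
plain = LAMED + PATAH + HIRIQ
# Same marks, with a cc=0 character inserted between them.
blocked = LAMED + PATAH + RLM + HIRIQ

plain_nfd = unicodedata.normalize("NFD", plain)
blocked_nfd = unicodedata.normalize("NFD", blocked)

# Canonical ordering sorts adjacent nonzero classes, so hiriq (14) moves before patah (17).
print(combining_classes(plain), "->", combining_classes(plain_nfd))
print("reordered:", plain_nfd == LAMED + HIRIQ + PATAH)

# A cc=0 character splits the run of marks, so nothing is reordered.
print(combining_classes(blocked), "->", combining_classes(blocked_nfd))
print("preserved:", blocked_nfd == blocked)
```

The sketch shows only that any cc=0 character has this blocking effect on canonical ordering; it says nothing about the side effects of the particular control chosen.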
I wasn't expecting the surf to go off in this direction since I was under the impression when we discussed this back in December on unicoRe that there was a consensus that we should pursue just exactly what I wrote in the proposal.

If we want to insert a control character to prevent reordering under canonical ordering, I think it would be preferable to create a new control character for just that purpose: that would give a character that could be used elsewhere for the very same purpose without needing to worry about what unanticipated and undesirable effects might result by hijacking a control created for some completely unrelated purpose. For instance, you suggested RLM. Suppose next week we discover a very similar issue in a LTR script; do we want to insert RLM to prevent mark reordering in that case? No! Do we want to be telling people to pick and choose from various controls, using different ones according to the directionality of the text? What if the base character is a neutral, or has selectable directionality (I'm thinking ahead to Tifinagh, which is written either LTR or RTL)? Are we also going to introduce the use of PDF for this purpose in some contexts? How complicated do we want to make this? (Every time we conflate distinct functions on a single control character, we are inviting added complication, and are setting ourselves up for regrets. One might think that lesson was learned from the conflation of ZWNBSP and BOM.)



- Peter


---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485

