Jonathan Coxhead asked:

> >>Then could/should we use the sequence <200C, 062D, 20DD, 200C>?
> >
> > You *could* use that sequence, and if your rendering implementation
> > were sophisticated enough, it *might* render what you were
> > expecting.
>
> So here's my question ...
>
> If I did write the sequence <200C, 062D, 20DD, 200C>, what
> *should* I expect?
>
> It seems to me that---barring bugs---this ought to produce the symbol
> expected, in a completely standard-conforming way, and with no extra
> encoding needed.
Well, *assuming* that you are dealing with a Unicode 4.1 (or subsequent) implementation of Arabic and bidi that has been updated to the Unicode 4.1 data files (so that U+20DD is jt=Transparent), and *assuming* you have access to a font that can actually represent a circle around U+062D legibly, yes, you should expect to see a circled HAH.

Note that <200C, 062D, 200C, 20DD> should *also* produce the same visual rendering, but is not canonically equivalent to the first sequence. So that is *another* fly in the ointment.

> If I write <200C, 062D, 20DD, 200C>, and I don't see this Saudi
> copyright sign, shouldn't I be able to complain to someone for
> non-compliance?

No, not if the renderer you are using doesn't claim to "interpret" U+20DD for rendering.

> (Of course, I might not like its baseline, or size, or stroke-width,
> but I'm sure I could get over it.)
>
> Exactly what "wiggle-room" exists, in the current state of play?

We can all figure out what things ought to be like, but this is a very murky area for implementations -- behavior of combining enclosing marks, which have never been very well defined themselves, in combination with orthogonal format control characters whose implementation is itself complex.

Rather than engage in thought experiments about who we could blame for being in non-compliance if some weird sequence doesn't display as we expect, in this case it is much more straightforward to just encode the symbol in question and be done with it.

That was essentially the argument that carried the day for other complex symbols such as U+FDFD ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM.

--Ken
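
The canonical-equivalence point can be checked mechanically. What follows is a minimal sketch, assuming Python's standard `unicodedata` module; the helper name `canonically_equivalent` is illustrative only. Two strings are canonically equivalent exactly when their NFD forms are identical. Both U+200C and U+20DD have canonical combining class 0, so normalization never reorders them, and the two sequences stay distinct.

```python
import unicodedata

ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
HAH = "\u062D"     # ARABIC LETTER HAH
CIRCLE = "\u20DD"  # COMBINING ENCLOSING CIRCLE

seq_a = ZWNJ + HAH + CIRCLE + ZWNJ  # <200C, 062D, 20DD, 200C>
seq_b = ZWNJ + HAH + ZWNJ + CIRCLE  # <200C, 062D, 200C, 20DD>


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings by their NFD forms."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Canonical combining classes of the two characters that swap places.
# Class 0 means canonical reordering never moves the character.
print([unicodedata.combining(c) for c in (ZWNJ, CIRCLE)])

# The two sequences differ only in the order of two class-0 characters,
# so they remain different after normalization.
print(canonically_equivalent(seq_a, seq_b))
```

A process that compares or searches normalized text will therefore treat the two spellings as different strings, even if a renderer draws them identically.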

