Gautam Sengupta wrote:
> --- Marco Cimarosti wrote:
> > OK but, then, your <ZWJ> becomes exactly what
> > Unicode's <VIRAMA> has always
> > been: [...]
>
> You are absolutely right. I am suggesting that the
> language-specific viramas be retained as
> script-specific *explicit* viramas that never
> disappear. In addition, let's have a script-specific
> ZWJ which behaves in the way you describe in the
> preceding paragraph.

Good, good. We are making small steps forward.

What you are really asking for is that each Indic script have *two*
viramas:

- a "soft virama", which is normally invisible and only displays
  visibly in special cases (no ligatures for that cluster);

- a "hard virama" (or "explicit virama", as you correctly called it),
  which always displays as such and never ligates with adjacent
  characters.

Let's assume that it would be handy to assign these two viramas to
different keys on the keyboard. Or, even better, let's assign the "soft
virama" to the plain key and the "hard virama" to the same key with
SHIFT, OK? To avoid misunderstandings with the term "virama", let's
label this key "JOINER".

Now, this is what you *already* have in Unicode! On our hypothetical
Bangla keyboard:

- the "soft virama" (the plain JOINER key) is Unicode's
  <BENGALI SIGN VIRAMA>;

- the "hard virama" (the SHIFT+JOINER key) is Unicode's
  <BENGALI SIGN VIRAMA>+<ZWNJ>.

Not only does Unicode allow all of the above, but it also has a third
kind of "virama", which may or may not be useful in Bangla but is
certainly useful in Devanagari and Gujarati:

- the "half-consonant virama" (let's assign it to the ALT+JOINER key on
  our hypothetical keyboard), which forces the preceding consonant to
  be displayed as a half consonant, if possible. This is Unicode's
  <BENGALI SIGN VIRAMA>+<ZWJ>.

Notice that, once you have these three "viramas" on your keyboard, you
don't need keys for <ZWJ> and <ZWNJ>, as their only use in Indic
scripts is after a <xxx SIGN VIRAMA>.

Apart from the fact that two of the three viramas are encoded as a
*pair* of code points, how does the *current* Unicode model prevent you
from implementing the clean theoretical model that you have in mind?

> [...]
> > - independent and dependent vowels were the same
> > characters;
> [...]
>
> I agree with you on all of these issues. You have in
> fact summed up my critique of the ISCII/Unicode model.

OK.
But are you sure that this critique should necessarily be levelled at
the *encoding* model, rather than at some other part of the chain?

I'll now try to demonstrate how the redundancy of dependent/independent
vowels can also be solved at the *keyboard* level.

You are certainly aware that some national keyboards have so-called
"dead keys". A dead key is a key which does not immediately send any
character to the application but waits for a second key; on European
keyboards, dead keys are used to type accented letters.

E.g., let's see how accented letters are typed on the Spanish keyboard
(which, BTW, is by far the best designed keyboard in Western Europe):

1. If you press the <´> key, nothing is sent to the application, but
   the keystroke is memorized by the keyboard driver.

2. If you now press one of the <a>, <e>, <i>, <o>, <u> or <y> keys,
   character <á>, <é>, <í>, <ó>, <ú> or <ý> is sent to the application.

3. If you press the space bar, character <´> itself is sent to the
   application.

4. If you press any other key, e.g. <m>, the two characters <´> and <m>
   are sent to the application in this order.

Now, in the description above substitute:

- the <´> key with <0985 BENGALI LETTER A> (but let's label it
  "VIRTUAL CONSONANT");

- the <a> ... <y> keys with <09BE BENGALI VOWEL SIGN AA> ...
  <09CC BENGALI VOWEL SIGN AU>;

- the <á> ... <ý> characters with <0986 BENGALI LETTER AA> ...
  <0994 BENGALI LETTER AU>.

What you have is a Bangla keyboard where dependent vowels are typed
with a single <vowel> keystroke, and independent vowels are typed with
the sequence <VIRTUAL CONSONANT>+<vowel>.

Do you prefer your <cons>+<VIRAMA>+<vowel> model? Personally, I find it
suboptimal, as it requires, on average, more keystrokes. However, if
that's what you want, in the Spanish keyboard description above
substitute:

- the <´> key with the unshifted <JOINER> (= virama) key that we have
  already defined above;

- the <a> ... <y> keys with <0986 BENGALI LETTER AA> ...
  <0994 BENGALI LETTER AU>;

- the <á> ... <ý> characters with <09BE BENGALI VOWEL SIGN AA> ...
  <09CC BENGALI VOWEL SIGN AU>.

Now you have a Bangla keyboard where independent vowels are typed with
a single keystroke, and dependent vowels are typed with the sequence
<JOINER>+<vowel>.

_ Marco
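
P.S. For the curious, here is a minimal sketch in Python of the keyboard
logic described in this message. It is only an illustration, not a real
keyboard driver: the names DeadKeyDriver, JOINER_KEYS, COMPOSE and
type_keys are hypothetical, and the vowel table is my assumption about
which pairs the "..." ranges cover (only vowels that exist in both the
independent and the dependent form). The code points themselves are the
ones in the Unicode Bengali block.

```python
# Illustrative sketch only: every name here is hypothetical.

VIRAMA = "\u09CD"             # BENGALI SIGN VIRAMA
ZWNJ = "\u200C"               # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"                # ZERO WIDTH JOINER
VIRTUAL_CONSONANT = "\u0985"  # BENGALI LETTER A

# The three "viramas" on the hypothetical JOINER key.
JOINER_KEYS = {
    "JOINER": VIRAMA,               # soft virama
    "SHIFT+JOINER": VIRAMA + ZWNJ,  # hard (explicit) virama
    "ALT+JOINER": VIRAMA + ZWJ,     # half-consonant virama
}

# Dependent vowel sign -> independent vowel letter.
COMPOSE = {
    "\u09BE": "\u0986",  # AA
    "\u09BF": "\u0987",  # I
    "\u09C0": "\u0988",  # II
    "\u09C1": "\u0989",  # U
    "\u09C2": "\u098A",  # UU
    "\u09C3": "\u098B",  # VOCALIC R
    "\u09C7": "\u098F",  # E
    "\u09C8": "\u0990",  # AI
    "\u09CB": "\u0993",  # O
    "\u09CC": "\u0994",  # AU
}


class DeadKeyDriver:
    """Models steps 1-4 of the Spanish dead-key description."""

    def __init__(self, dead_key, compose):
        self.dead_key = dead_key
        self.compose = compose
        self.pending = False  # True while the dead key is memorized

    def press(self, key):
        """Return the text sent to the application for one keystroke."""
        if not self.pending:
            if key == self.dead_key:
                self.pending = True   # step 1: memorize, send nothing
                return ""
            return key                # ordinary key, sent as-is
        self.pending = False
        if key in self.compose:
            return self.compose[key]  # step 2: composed character
        if key == " ":
            return self.dead_key      # step 3: the dead key itself
        return self.dead_key + key    # step 4: both, in this order


def type_keys(driver, keys):
    """Feed a sequence of keystrokes and collect the resulting text."""
    return "".join(driver.press(k) for k in keys)


# First layout: VIRTUAL CONSONANT is the dead key.
virtual_consonant_layout = DeadKeyDriver(VIRTUAL_CONSONANT, COMPOSE)

# Second layout: the plain JOINER (virama) is the dead key, and the
# table is reversed (independent letter -> dependent vowel sign).
joiner_layout = DeadKeyDriver(VIRAMA, {v: k for k, v in COMPOSE.items()})
```

With the first layout, <VIRTUAL CONSONANT> followed by <09BE> yields
<0986>, while <09BE> alone is passed through unchanged. With the second
layout, <JOINER> followed by a consonant falls under step 4, so the
virama and the consonant are both sent, which is exactly what is needed
for a conjunct.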

