Somnath Kundu <skundu at cal dot vsnl dot net dot in> wrote:
> It appears that Bengali consonant "khanda ta" is not included in
> Unicode Standard 3.0 for Bengali script. "Khanda ta" is the halant
> form of "ta" (09A4) but it is considered a distinct consonant in
> Bengali script. It comes between 09DF and 0982, looks like the
> character 096F and is pronounced as 't', i.e., without the inherent
> vowel 'a' in 'ta'. There are many common words in Bengali that uses
> this consonant.
>
> Can someone shed some light on why it was not included in the Unicode
> Standard and how Unicode Consortium intend to support it? I am asking
> this question because I see that there is problem supporting it as a
> combination of 09A4+09CD because it is used to create half form of
> 'ta'. It is also not in the list of proposed characters.
>
> Keenly awaiting for any reply,

Indic scripts aren't exactly my strong point, but in the interest of
providing *any* reply...

"Khanda-ta" appears to be "ta" with the inherent "a" killed. That
would normally point to the use of "ta" (U+09A4) followed by the
Bengali virama (U+09CD). If this sequence results in a "half ta" glyph
which is different from khanda-ta, then the sequence ta + virama + ZWJ
(U+200D) should be used instead.

Many "characters" or character forms in Unicode, especially in Indic
and other complex scripts, are implemented as sequences involving
combining marks such as the virama, ZWJ, and ZWNJ.

Also note that the concept "comes between" has to do with collation,
which is language-dependent and not related to Unicode code point
order.

Now I have a question for the true Unicode/Indic experts:

This mailing list gets a LOT of questions asking why Indic
half-consonants and other forms (such as khanda-ta) aren't separately
encoded in Unicode. The Unicode model for Indic scripts is supposedly
based on ISCII-1988. How were these problems handled in ISCII? Do
users of ISCII have the same problems? Are there significant
differences between the ISCII and Unicode approach to these issues,
and if so, should Unicode spell out more explicitly what those
differences are? (The FAQ talks rather generally about "in some cases"
and "in other cases.") Or are these questions being asked by people
who have previously used ASCII-hacked font solutions instead of ISCII?

-Doug Ewell
 Fullerton, California
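
For what it's worth, the two sequences discussed above can be spelled
out in a short Python sketch. It only builds the code point sequences
and lists their character names; the variable names and the describe()
helper are made up for illustration. Whether a given font and rendering
engine actually shows ta + virama + ZWJ as khanda-ta is an assumption
that has to be checked against the implementation in question. The last
lines illustrate the collation point: plain code point order puts
U+0982 before U+09DF, the reverse of the alphabetical order described
in the original question.

```python
import unicodedata

TA = "\u09A4"      # BENGALI LETTER TA
VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA
ZWJ = "\u200D"     # ZERO WIDTH JOINER


def describe(seq):
    """Return one 'U+XXXX NAME' string per code point in seq (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in seq]


# ta + virama: the inherent vowel is killed; this may render as a "half ta".
half_ta = TA + VIRAMA

# ta + virama + ZWJ: the sequence suggested above for requesting the
# khanda-ta shape when plain ta + virama renders differently.
# Assumption: the renderer honours ZWJ in this position.
khanda_ta_candidate = TA + VIRAMA + ZWJ

for label, seq in [("ta + virama", half_ta),
                   ("ta + virama + ZWJ", khanda_ta_candidate)]:
    print(label)
    for line in describe(seq):
        print("   " + line)

# Sorting by code point is not collation: U+0982 sorts before U+09DF here,
# even though the question places khanda-ta "between 09DF and 0982".
by_code_point = sorted(["\u09DF", "\u0982"])
print([f"U+{ord(ch):04X}" for ch in by_code_point])
```

Both strings are ordinary Unicode text; the difference between them is
a single invisible joiner, which is why the distinction is easy to lose
when text is retyped or passed through software that strips ZWJ.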

