I spent some more time going through the feedback I received here and by personal mail. I have tried to divide the discussion into bullet points. Wherever I have an answer, I put it in the questions and answers section; wherever I don't, I put it in the open issues section. The result is another post on my blog, which I am pasting here for reference. I also went through the related mails on Racchabanda and Unicode.org. References are given at the end.

This is a follow-up to my previous post, Telugu Unicode Encoding Review (http://geek.chavakiran.com/archives/55 ).

*1. Is anybody using reserved code points for private use?*

a. No, AFAIK. I haven't seen any such instances. But the reason may be the lack of serious computing work in the Telugu language, apart from whatever is happening on the web and the PC. Once people start using Unicode on mobile and for book publishing (an area now dominated by the dynamic encoding from Anu), we may get into this hell.

*2. ఌ and ౡ (and other related pairs) are not assigned consecutive code points. Is this a problem for sorting?*

a. The answer is no, because sorting is supposed to follow the Unicode collation charts. In those charts, as shown in my previous post <http://geek.chavakiran.com/archives/55>, the order looks OK. But if some encoding is going to replace Unicode in the future, I guess we had better have them in order; that might save sorting time. Moreover, whatever is not placed at consecutive code points is rarely used in Telugu (see my post on Telugu character usage, http://archives.chavakiran.com/?p=254 ), so we are spending unnecessary CPU time just for the sake of these rarely used characters, I guess. (FYI, the out-of-order code points are ఋ and ౠ, ఌ and ౡ, ౘ, ౙ, ళ, ఱ.) Of these characters, only ~La (ళ) seems to be used with good frequency. So for all practical purposes we could probably just do a plain binary sort and move on!

*3. Mr. Chava, you said "Telugu digits are not taught in school". Does that mean they are unnecessarily present in the Unicode encoding?*

Hmmm... Not exactly. Even though Telugu digits were not taught in school in my days, I guess they have been taught in recent years. Moreover, there are attempts to make people aware of them; for example, Hyderabad city buses now carry numbers in both Telugu digits and Indo-Arabic numerals.

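As a side note for anyone who wants to show both forms, as the buses do: the Telugu digits ౦ to ౯ sit at the contiguous code points U+0C66 to U+0C6F, so converting between them and 0 to 9 is mechanical. Below is a minimal Python sketch of that mapping. The function and table names are my own, chosen only for illustration, and this is not taken from any existing library.

```python
import unicodedata

# Telugu digits occupy the contiguous range U+0C66 (zero) to U+0C6F (nine).
TELUGU_DIGITS = "".join(chr(0x0C66 + i) for i in range(10))
ASCII_DIGITS = "0123456789"

# Translation tables (illustrative names, not part of any standard API).
TO_ASCII = str.maketrans(TELUGU_DIGITS, ASCII_DIGITS)
TO_TELUGU = str.maketrans(ASCII_DIGITS, TELUGU_DIGITS)


def to_indo_arabic(text: str) -> str:
    """Replace every Telugu digit in text with its Indo-Arabic counterpart."""
    return text.translate(TO_ASCII)


def to_telugu_digits(text: str) -> str:
    """Replace every Indo-Arabic digit in text with its Telugu counterpart."""
    return text.translate(TO_TELUGU)


if __name__ == "__main__":
    route = "Route 127"
    telugu_route = to_telugu_digits(route)
    print(telugu_route)                  # same text, digits in Telugu form
    print(to_indo_arabic(telugu_route))  # round-trips to the original
    # Unicode already records the numeric value of each Telugu digit,
    # so Python can read such numbers without any custom table.
    print(unicodedata.digit(chr(0x0C6D)))
    print(int(to_telugu_digits("127")))
```

Only the digit characters change; all other text passes through untouched, so the same string can be shown in either form.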
More importantly, religious and classical Telugu books printed very recently also use Telugu digits. For images, see my previous blog post <http://geek.chavakiran.com/archives/55>. My only point is that font developers should also feel free to use Indo-Arabic numerals for the Telugu digits.

*4. Mr. Chava, you said the current Telugu Unicode encoding is flawed. Do you detest the Unicode encoding?*

No. I love it for all the scenarios it has enabled for Telugu people in digital life. I love it; that is why I am spending time on it.

*5. Is the avagraha symbol encoded?*

Yes: U+0C3D.

*6. Does the OM (AUM) symbol need a code point in Telugu?*

My personal opinion: no. Unless I am missing something, Telugu Om is always a combination of 'O' and '~M'. Even on temples and calendars in Telugu land, the Devanagari OM is used, and wherever a Telugu Om is used it is a simple combination of 'O' and '~M'. There may be one or two special cases, but those must be artistic freedom and may not require a code point.

*Open issues:*

*1. Telugu danda and double danda are to be encoded.* (I saw some discussion of this here and there, but nothing conclusive. Has a decision been made?)

*2. How to encode notation for musical Telugu books (for example, a dot above a character, a dot below a character, a horizontal line above a character, a dot just before a character)?*

*3. How to encode a Telugu-script Vedic book (for example, a vertical line over a character, a horizontal line below a character)?* Answer? Do we need to use the code points from the Vedic block? http://www.unicode.org/charts/PDF/U1CD0.pdf

*4. Are Guruvu and Laguvu to be encoded with new code points?* (Suggested by Suresh Kolichala on the Racchabanda mailing list)

*6. The Yati symbol is to be encoded with a new code point.* (Suggested by Suresh Kolichala on the Racchabanda mailing list)

*7. Is there any way to encode Tala kaTTu?*

*8. Is there any way to encode ka ottu (క్క, the second half of the preceding glyph)?
This is required to encode a Telugu alphabet textbook, where children are first taught ka ottu and then, after a few lessons, are taught to combine it with other vowels. The same question applies to all the other ottulu.*

*9. What are the pros and cons of the new encoding scheme I proposed for Telugu script (section 9 of my blog post <http://geek.chavakiran.com/archives/55>)? Is this discussed somewhere?*

*References*

1. http://groups.yahoo.com/group/racchabanda/message/15576 Discussion of the tzh character in Telugu.
2. http://groups.yahoo.com/group/racchabanda/message/16367 RB mail after the previous changes to Telugu Unicode.
3. http://groups.yahoo.com/group/racchabanda/message/16378 A discussion of musical symbols in Telugu.
4. http://unicode.org/alloc/nonapprovals.html Non-approval of arda visarga.
5. http://unicode.org/~emuller/southasia/vedic/ Encoding of Vedic.

----
Thanks,
కిరణ్ కుమార్ చావా (Kiran Kumar Chava)
http://te.chavakiran.com/blog
http://en.chavakiran.com/blog

2010/10/17 Frédéric Grosshans <[email protected]>

> On Saturday, 16 October 2010 at 22:36 +0530, Kiran Kumar Chava wrote:
> > At the link, http://geek.chavakiran.com/archives/55 , I tried to
> > understand Telugu Unicode encoding and then I tried to do an out of
> > box review of this encoding. Kindly let me know if I am missing
> > something, mentioned as missing in above article are really missing or
> > not. Any other views...
>
> The 13 Telugu characters added in Unicode 5.1, including the fractions,
> are enumerated here :
> http://www.unicode.org/charts/PDF/Unicode-5.1/U51-0C00.pdf .
>
> The rationale for their inclusion are documented in
> http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3116.pdf (which proposed 18
> characters) . I have not looked close enough to check whether the 5
> "missing" characters are linked to the one you consider as missing.
>
> Frédéric
>
> --
> Frédéric Grosshans
> Chargé de Recherche
> Laboratoire de Photonique Quantique et Moléculaire
> ENS Cachan / CNRS UMR 8437
> tel: (+33)1 47 40 77 15
> GSM: (+33)6 09 24 29 64
> e-mail: [email protected]

