If we have succeeded with Sanskrit (Devanagari script), which is the mother language of the Indic family, then no doubt Tesseract 3.01 should work. I have tested it with Tamil, which has dependent vowels like those of Bengali, as well as Kannada and Telugu. The only problem is output accuracy, which for the time being can be addressed with the help of a post-processor. The latest version of FreeOCR, 4.1 (July10), has a post-processor developed by Ralph, and I have found that the latest tesseract.exe works with FreeOCR to this day. VietOCR, developed by Quan, also has a post-processor feature, which works for Indic scripts as well as Vietnamese.
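To illustrate the kind of clean-up a post-processor can do, here is a minimal Python sketch. It is not the FreeOCR or VietOCR implementation; the function name `postprocess` and the `CORRECTIONS` table are hypothetical, and the two Devanagari rules are only examples of visually identical sequences that an OCR engine might emit in place of the single intended character.

```python
# Hypothetical correction table: (sequence as the OCR emitted it, intended text).
# These two Devanagari pairs are illustrative assumptions, not rules taken
# from FreeOCR or VietOCR.
CORRECTIONS = [
    ("\u0905\u093e", "\u0906"),  # letter A + vowel sign AA -> letter AA
    ("\u093e\u0947", "\u094b"),  # vowel sign AA + vowel sign E -> vowel sign O
]


def postprocess(text, corrections=CORRECTIONS):
    """Apply plain string substitutions to raw OCR output."""
    # Longer patterns go first so a short rule cannot break up a longer match.
    ordered = sorted(corrections, key=lambda pair: len(pair[0]), reverse=True)
    for wrong, right in ordered:
        text = text.replace(wrong, right)
    return text


if __name__ == "__main__":
    raw = "\u0915\u093e\u0947"  # KA followed by the two separate vowel signs
    print(postprocess(raw))
```

A table like this would have to be built up per language from the errors actually seen in the output; text that matches no rule passes through unchanged.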
With regards,
-sriranga(78yrs)

On Wed, Mar 30, 2011 at 11:47 AM, Debayan Banerjee <[email protected]> wrote:
> Hi,
>
> I gather that Tesseract 3.0 works well for Chinese script now. The
> hallmark of Chinese script is that it is unconnected (unlike say Hindi
> which has a line connecting all its characters), and it has a large
> number of characters in the alphabet. In this light, I think it should
> also work well with unconnected Indic script such as Kannada,
> Malayalam, Punjabi etc.
>
> Anyone know if this works?
>
> --
> Debayan Banerjee
> http://hacking-tesseract.blogspot.com/
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.

