Dear Hasnat,

It feels great to see Bangla training data on the Tesseract OCR downloads page. I have been working in the same direction for some time now, purely out of interest. I am in the final year of my undergraduate studies. The progress made so far is documented at http://debayanin.googlepages.com/hackingtesseract and the project is hosted at http://code.google.com/p/tesseractindic. In fact, you may already be aware of my work if you have done some googling.

Your training data is pretty comprehensive and very helpful. However, it gives mediocre results when used with my segmentation code. I haven't yet tried your work, but will do so soon. My latest work can be obtained via svn checkout:

svn checkout http://tesseractindic.googlecode.com/svn/trunk/tesseractindic-read-only
My question is this: you have created training data for Tesseract, but the current Tesseract code does not support Bengali, so the training data on its own is of limited use right now. You can, however, test it using my code and, of course, your own. I will be working on this for at least the next 6 months, so please tell me what your plans are so that I can join the effort.

--
BE INTELLIGENT, USE LINUX
http://lug.nitdgp.ac.in
http://mukti09.in
http://planet-india.randomink.org

