Query regarding Bangla OCR

Debayan Banerjee Tue, 11 Nov 2008 13:02:43 -0800

Dear Hasnat,
It feels great to see Bangla training data on the Tesseract OCR downloads
page. I have been working in the same direction for some time now, just out
of interest. I am in my final year of undergraduate studies. My progress so
far is documented at http://debayanin.googlepages.com/hackingtesseract and the
project is hosted at http://code.google.com/p/tesseractindic. In fact, you may
already be aware of my work if you have done some googling.
Your training data is pretty comprehensive and very helpful. However, it
gives mediocre results when used with my segmentation code. I haven't
yet tried your work, but will do so soon.
My latest work is available via svn checkout:
svn checkout http://tesseractindic.googlecode.com/svn/trunk/tesseractindic-read-only


My question is this: you have created training data for Tesseract, but the
current Tesseract code does not support Bengali, so the training data is of
little use with it right now. You can, however, test it using my code and,
of course, your own code.
I will be working on this for at least the next 6 months, so tell me what
your plans are so that I can join the effort.

-- 
BE INTELLIGENT, USE LINUX
http://lug.nitdgp.ac.in
http://mukti09.in
http://planet-india.randomink.org

