Re: [Ankur-core] Bangla OCR progress

Debayan Banerjee Sun, 19 Apr 2009 06:17:27 -0700

Dear Salahuddin,
>
>
> I was working on OCR for my university. I took most of the ideas
> from bocra.sourceforge.net
>
> It is written in C++ using the GraphicsMagick library. Do you have any
> suggestions about matching the alphabet?


You now need a recogniser. You could use a neural network library or
an adaptive classifier. Tesseract-OCR, the one I am trying to adapt,
previously used a neural net named Aspirin/MIGRAINES and then switched
to a nearest-neighbour based adaptive classifier engine. I believe this
switch was made because of licensing issues with Aspirin.
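
As a rough illustration of the nearest-neighbour idea mentioned above, here is a minimal sketch in Python. It is not Tesseract's classifier: the class name, the 3x3 bitmaps and the use of raw pixels as features are all assumptions made for the example. "Adapting" here simply means storing one more labelled sample.

```python
from math import dist  # Euclidean distance, Python 3.8+


def to_vector(bitmap):
    """Flatten a glyph bitmap (equal-length strings, '#' = ink) into 0/1 features."""
    return [1.0 if ch == "#" else 0.0 for row in bitmap for ch in row]


class NearestNeighbourClassifier:
    """Hypothetical toy classifier: remembers samples, answers with the closest one."""

    def __init__(self):
        self.samples = []  # list of (feature_vector, label) pairs

    def add(self, bitmap, label):
        # Adaptation step: store one more labelled example.
        self.samples.append((to_vector(bitmap), label))

    def classify(self, bitmap):
        # Return the label of the stored sample nearest to the query.
        if not self.samples:
            raise ValueError("no training samples")
        query = to_vector(bitmap)
        _, label = min(self.samples, key=lambda s: dist(s[0], query))
        return label


# Made-up 3x3 glyphs, used only to exercise the classifier.
clf = NearestNeighbourClassifier()
clf.add(["###", ".#.", ".#."], "T")
clf.add(["#..", "#..", "###"], "L")
guess = clf.classify(["#..", "#..", "##."])  # an "L" with one pixel missing
```

A real recogniser would use more robust features than raw pixels, but the structure (store labelled samples, pick the closest) stays the same.
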
The challenge, of course, is not to build a recogniser, since you can use
one of the available ones. The challenge is to gather sufficient
training data or, better yet, to create a tool that automatically
generates training data for this OCR system (given a font name and size)
using image rendering, in a matter of seconds.

I have been trying to do this, but my initial approach was wrong.
However, I believe I now know the correct approach.
Kindly go through http://hacking-tesseract.blogspot.com/.
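
The training-data idea above (render known text, so the labels come for free) can be sketched roughly as follows. This is only an illustration under stated assumptions: TINY_FONT is a hypothetical 3x3 bitmap "font" standing in for a real font renderer, and render_line and its box layout are invented for the example, not taken from Tesseract or from the blog.

```python
# Hypothetical bitmap "font"; a real tool would rasterise an actual font
# at the requested size instead.
TINY_FONT = {
    "T": ["###", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
}


def render_line(text, font=TINY_FONT, margin=1, gap=1):
    """Render `text` left to right and return (page_rows, boxes).

    Each box is (char, left, top, right, bottom) in pixels, with the
    origin at the top-left corner and right/bottom exclusive.
    """
    if not text:
        raise ValueError("text must not be empty")
    glyphs = [font[ch] for ch in text]  # KeyError if a character is missing
    height = max(len(g) for g in glyphs) + 2 * margin
    width = sum(len(g[0]) for g in glyphs) + gap * (len(glyphs) - 1) + 2 * margin
    page = [["."] * width for _ in range(height)]
    boxes = []
    x = margin
    for ch, glyph in zip(text, glyphs):
        # Copy the glyph's pixels onto the page at the current pen position.
        for dy, row in enumerate(glyph):
            for dx, pixel in enumerate(row):
                page[margin + dy][x + dx] = pixel
        # The renderer placed the glyph, so its label and box are known exactly.
        boxes.append((ch, x, margin, x + len(glyph[0]), margin + len(glyph)))
        x += len(glyph[0]) + gap
    return ["".join(row) for row in page], boxes


page, boxes = render_line("TL")
```

Because the program places every glyph itself, no manual labelling is needed. The boxes can then be converted to whatever coordinate convention the recogniser's training tools expect.
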


-- 
Be Intelligent, Use GNU/Linux

http://debayanin.googlepages.com/
http://debayan.wordpress.com
http://lug.nitdgp.ac.in

_______________________________________________
Bengalinux-core mailing list
Bengalinux-core@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bengalinux-core
