Hi everyone,

Here is my list of objectives for the meet.

Objectives/Deliverables for Indic Meet

I shall first demonstrate the OCR on some sample images, and then
explain how the OCR system works at a high level. This will be followed
by a demonstration of the problems in the present system and the
potential solutions I have in mind. I shall also demonstrate how to
train this OCR for a particular language. This part should be over in
75 minutes.

We then move on to the problems I am facing and discuss possible
solutions. Here are a few problems to tackle:

1) Learning about the various efforts made in the past (BOCRA,
Aksharbodh, etc.)
2) Dealing with the post-OCR spell-checker problem
3) A better segmentation algorithm: the Ocropus curved-cut segmenter,
its merits and demerits
4) Reducing the number of character classes to be trained, as explained
at http://hacking-tesseract.blogspot.com/2009/05/bengali-stats.html
5) Talking to Santhosh Thottingal about integrating the service into
Silpa
6) How to build a web interface that can train the OCR engine from user
input

Taken from 
http://hacking-tesseract.blogspot.com/2009/05/issues-for-indic-meet.html
-- 
Regards,
Debayan Banerjee

Support Free Software
http://deeproot.in

_______________________________________________
IndLinux-group mailing list
https://lists.sourceforge.net/lists/listinfo/indlinux-group
