http://www.ijcaonline.org/proceedings/icdcit2014/number1/14381-1306

http://research.ijcaonline.org/icdcit2014/number1/icdcit1306.pdf

"Mamata Nayak and Ajit Kumar Nayak. Article: Odia Characters Recognition by 
Training Tesseract OCR Engine. *IJCA Proceedings on International 
Conference on Distributed Computing and Internet Technology 2014* 
ICDCIT-2014:25-30, 
December 2013. Published by Foundation of Computer Science, New York, USA. 
Abstract

Development of Optical Character Recognition (OCR) for Indian scripts is 
an active area of research today. The large number of letters in the 
alphabet, their sophisticated combinations, and the complicated graphemes 
they form pose a great challenge to an OCR designer. OCR has many 
application areas, such as preserving old documents in electronic format, 
helping visually impaired persons learn the content of a document by 
transforming it into speech, storing document images in limited space, 
building an electronic dictionary of words, preserving ancient characters 
that are not included in the current character set of a language, and 
many more. Currently, Tesseract, an open source OCR engine, is considered 
one of the most accurate FOSS OCR engines. Tesseract has already been 
designed to recognize English, Italian, French, German, Spanish, Dutch, 
and many more languages [11], as well as a few Indian languages such as 
Bengali, Tamil, Telugu, and Malayalam. Similarly, Tesseract can be made to 
recognize other scripts if the engine is trained with the requisite data. 
The objective of this work is to develop a training process for the 
Tesseract OCR engine such that the engine will be capable of recognizing 
printed documents in the Odia language, used in the state of Odisha 
(formerly known as Orissa), India."
