I've just finished setting up a section at www.CustomOCR.com dedicated to community-supported languages. Link: http://www.customocr.com/index.php?r=site/page&view=demos.experimental
Let's begin with Kannada. Credit goes to Sriranga, who provided the traineddata file.

To Sriranga: I've tested it with some images and, as expected, noticed quite a few recognition errors. You mentioned postprocessing, but I'd rather devise a preprocessor for Kannada, since many of the errors are caused by Tesseract's improper segmentation of disjoint or multipart characters. I'd also like to have more images with real (not training) Kannada text, both to test the OCR and to place on the website as samples.

Warm regards,
Dmitri Silaev
www.CustomOCR.com

