Kannada characters are essentially built up of several components, and in
many fonts these components do not touch, leaving a gap between them.
Another feature is the inherent separation within characters such as
ಕೀ, ಕೇ, ಕೋ and within compound characters. Tesseract does not treat what
is inside a training box as one character; it recognizes it as more than
one, and in such cases the output is distorted. Could the developers kindly
modify the Tesseract code so that whatever is inside a box used for
training a language file is treated as one character and is not split in
the output, even if Tesseract's internal vertical scanning finds a gap
inside that box?
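
To make the request concrete, here is a rough sketch in Python of the
behaviour I have in mind. It is not Tesseract code and does not use any
Tesseract API; the names Rect, contains_center, merge and
group_components are made up for illustration, and the coordinates are
invented. It assumes the box-file convention of left, bottom, right, top
with the origin at the bottom-left of the image. Every separate component
whose centre falls inside a training box is merged into a single
rectangle carrying that box's label, even when there is a vertical gap
between the components.

```python
from typing import List, NamedTuple, Optional, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle, origin at the bottom-left of the image."""
    left: int
    bottom: int
    right: int
    top: int


def contains_center(box: Rect, part: Rect) -> bool:
    """Return True if the centre of `part` lies inside `box`."""
    cx = (part.left + part.right) / 2
    cy = (part.bottom + part.top) / 2
    return box.left <= cx <= box.right and box.bottom <= cy <= box.top


def merge(parts: List[Rect]) -> Rect:
    """Smallest rectangle that covers every rectangle in `parts`."""
    return Rect(
        min(p.left for p in parts),
        min(p.bottom for p in parts),
        max(p.right for p in parts),
        max(p.top for p in parts),
    )


def group_components(
    training_boxes: List[Tuple[str, Rect]],
    components: List[Rect],
) -> List[Tuple[str, Optional[Rect]]]:
    """Merge all components inside each training box into one rectangle.

    A training box that contains no component yields None.
    """
    result = []
    for label, box in training_boxes:
        inside = [c for c in components if contains_center(box, c)]
        result.append((label, merge(inside) if inside else None))
    return result


# Hypothetical data: the first character is drawn as two pieces with a
# gap between x=28 and x=31; the second is a single piece.
training_boxes = [
    ("ಕೀ", Rect(10, 5, 40, 45)),
    ("ಕ", Rect(50, 5, 70, 40)),
]
components = [
    Rect(10, 5, 28, 40),   # base consonant
    Rect(31, 8, 40, 45),   # detached vowel sign
    Rect(50, 5, 70, 40),   # single-piece character
]

grouped = group_components(training_boxes, components)
for label, rect in grouped:
    print(label, rect)
```

With this grouping the two detached pieces of ಕೀ come out as one unit
under one label, which is the result I would like to get from the
training boxes.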

