In my OCR use case, Tesseract cannot identify text rows properly. Please see 
the attached box image below. (Blue squares are the boxes found by Tesseract; 
the red areas are the ones I marked as problematic.)

[image: qq 20160615195248] 
<https://cloud.githubusercontent.com/assets/3959938/16078734/118747c2-3333-11e6-890a-5eeaac13f217.png>

It seems that Tesseract cannot find the baseline correctly when the row 
spacing is small and the image is slightly skewed: two characters in 
adjacent rows are mistakenly merged vertically. As a result, the OCR 
quality in "crowded" areas is really poor.

How could I improve the OCR quality in this situation? Are there any 
parameters that can be used here?
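
One preprocessing idea is to estimate the skew and rotate the image before 
passing it to Tesseract. Below is a minimal sketch of a projection-profile 
skew estimate in pure Python. It assumes the page is already binarized into 
a list of rows of 0/1 values (1 = ink). The function names are hypothetical, 
this is not how Tesseract finds baselines internally, and whether deskewing 
alone fixes the merged rows is untested here.

```python
import math


def row_profile(pixels, angle_deg):
    """Count ink pixels per row after shearing the page by angle_deg.

    Positive angles undo text lines that descend to the right
    (y grows downward, as in most image formats).
    """
    height = len(pixels)
    slope = math.tan(math.radians(angle_deg))
    counts = [0] * height
    for y, row in enumerate(pixels):
        for x, ink in enumerate(row):
            if ink:
                # Shift each ink pixel vertically in proportion to its x position.
                yy = int(round(y - x * slope))
                if 0 <= yy < height:
                    counts[yy] += 1
    return counts


def profile_score(counts):
    """Sum of squared differences between neighbouring rows.

    The score is high when text rows and the gaps between them are sharply
    separated, which is what a well-aligned page looks like.
    """
    return sum((a - b) ** 2 for a, b in zip(counts, counts[1:]))


def estimate_skew(pixels, max_angle=5.0, step=0.25):
    """Try angles in [-max_angle, max_angle] and return the best-scoring one."""
    steps = int(round(max_angle / step))
    best_angle, best_score = 0.0, -1
    for i in range(-steps, steps + 1):
        angle = i * step
        score = profile_score(row_profile(pixels, angle))
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```

The returned angle (in degrees) could then be used to rotate the image with 
whatever image library is at hand before running OCR. The search range and 
step size above are arbitrary choices for illustration.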
