Hello, I'm working on a mobile app that uses the Tesseract library for OCR. I trained Tesseract on my own fonts, but the results are still very unstable. When I debug the results, it seems the library recognizes letters correctly when the boxes are found correctly. However, in many cases the boxes are incorrect.
For preprocessing I'm using adaptive thresholding, which works pretty well. The common problems with the boxes are:
1) one character detected as two, or vice versa
2) very tall, narrow boxes that span several lines
3) boxes not detected at all

How can I improve box detection? Can I constrain box sizes or their aspect ratio? Any suggestions are appreciated.

Mike
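P.S. To make the size/ratio question concrete, here is a rough sketch in plain Python of the kind of constraint I have in mind. It does not call any Tesseract API. The `Box` type, the `filter_boxes` function, and all threshold values are hypothetical and only illustrate rejecting boxes by size and width/height ratio.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned character box in pixel coordinates (hypothetical type)."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top


def filter_boxes(
    boxes: List[Box],
    min_size: int = 4,        # assumed: anything smaller is noise
    max_height: int = 60,     # assumed: roughly one text line, in pixels
    min_ratio: float = 0.1,   # assumed lower bound on width / height
    max_ratio: float = 1.5,   # assumed upper bound on width / height
) -> List[Box]:
    """Keep only boxes whose size and aspect ratio look like one character."""
    kept = []
    for box in boxes:
        # Reject specks that are too small in either dimension.
        if box.width < min_size or box.height < min_size:
            continue
        # Reject boxes taller than a single line (problem 2 above).
        if box.height > max_height:
            continue
        # Reject boxes that are too narrow or too wide for one glyph.
        ratio = box.width / box.height
        if min_ratio <= ratio <= max_ratio:
            kept.append(box)
    return kept


if __name__ == "__main__":
    candidates = [
        Box(0, 0, 20, 30),    # plausible single character
        Box(30, 0, 35, 200),  # tall narrow box spanning several lines
        Box(40, 0, 42, 2),    # tiny speck
    ]
    print(filter_boxes(candidates))
```

A filter like this only discards bad boxes after the fact. It cannot repair a character that was split in two or merged with its neighbour, so I would prefer a way to apply such limits inside the library during segmentation, if one exists.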

