[tesseract-ocr] Train tesseract for 14-segment display

Pierre-Henri DAUVERGNE Fri, 03 Jul 2015 07:21:02 -0700

Hello everyone.

I've already posted this on Stack Overflow but haven't had an answer yet
(http://stackoverflow.com/questions/31131796/14-segment-display-and-tesseract-ocr-with-opencv).

I'm looking for a way to accurately OCR a 14-segment display. As you can see
in my SO thread, I trained tesseract with dilated characters in which all
the segments are linked together. My issue is that when I read a character
from my webcam, I have to erode it first to remove noise. After that, I
dilate it.
However, I can't dilate it enough to link all the segments together without
running into issues with letters like 'B' and 'D', and the letter 'V' is not
recognized at all (I believe this is because the gap between the diagonal
segments is too large).
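
As an illustration of the erode-then-dilate step described above, here is a minimal pure-Python sketch. It is not the actual webcam pipeline: the 3x3 kernel, the test image, and the helper names (morph, erode, dilate) are assumptions made for the example. In OpenCV the corresponding calls would be cv2.erode and cv2.dilate.

```python
# Pure-Python sketch of binary erosion/dilation with a rectangular kernel.
# Images are lists of rows; 1 = segment pixel, 0 = background.
# Kernel sizes are assumed to be odd.

def morph(img, kh, kw, op):
    """Apply op (min = erode, max = dilate) over a kh x kw window."""
    h, w = len(img), len(img[0])
    ry, rx = kh // 2, kw // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # The window is clipped at the image border.
            window = [img[yy][xx]
                      for yy in range(max(0, y - ry), min(h, y + ry + 1))
                      for xx in range(max(0, x - rx), min(w, x + rx + 1))]
            row.append(op(window))
        out.append(row)
    return out

def erode(img, kh=3, kw=3):
    return morph(img, kh, kw, min)

def dilate(img, kh=3, kw=3):
    return morph(img, kh, kw, max)

# Hypothetical 9x9 test image: a 3x3 blob (part of a segment) plus one noise pixel.
img = [[0] * 9 for _ in range(9)]
for y in range(1, 4):
    for x in range(1, 4):
        img[y][x] = 1
img[7][7] = 1  # isolated noise pixel

# Erode first (drops the noise pixel), then dilate (restores the blob).
opened = dilate(erode(img))
```

With this kernel the noise pixel disappears while the blob keeps its size. The same erosion would also wipe out any segment thinner than the kernel, which limits how aggressively noise can be removed before dilating.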

What I trained tesseract with (this is the letter "V"):
http://i.imgur.com/NbmVqkb.png (segments are all linked)

What I feed tesseract with: http://i.imgur.com/0E4iXXk.png (some
segments are linked, some aren't)

I tried to train tesseract with characters whose segments aren't all
linked, but it reports "Empty page !!". When I manually link the segments,
training works fine. It seems odd that tesseract can't be trained on
characters with blank space inside them, since some of the existing
languages (e.g. Arabic or Chinese) already have such characters.

To work around this issue, I've been trying different kinds of image
processing algorithms (like thinning, in order to dilate "in height" but
not "in width"), but none gave more accurate results.
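
To make the idea of dilating "in height" but not "in width" concrete, here is a minimal pure-Python sketch using a tall, one-pixel-wide kernel. The image, the kernel sizes, and the helper name dilate_rect are assumptions made for the example. In OpenCV the equivalent would be passing a non-square kernel, such as cv2.getStructuringElement(cv2.MORPH_RECT, (1, 3)) (width 1, height 3), to cv2.dilate.

```python
# Pure-Python sketch of binary dilation with a kh x kw rectangular kernel.
# Images are lists of rows; 1 = segment pixel, 0 = background.
# Kernel sizes are assumed to be odd.

def dilate_rect(img, kh, kw):
    """Set a pixel to 1 if any pixel in its kh x kw neighbourhood is 1."""
    h, w = len(img), len(img[0])
    ry, rx = kh // 2, kw // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # The window is clipped at the image border.
            window = [img[yy][xx]
                      for yy in range(max(0, y - ry), min(h, y + ry + 1))
                      for xx in range(max(0, x - rx), min(w, x + rx + 1))]
            row.append(max(window))
        out.append(row)
    return out

# Hypothetical 8x5 image: the left stroke (column 1) has a 2-row gap between
# two segments; the right stroke (column 3) is unbroken. Column 2 is the
# background between them and should stay empty.
img = [
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
]

tall = dilate_rect(img, 3, 1)    # height 3, width 1: grows vertically only
square = dilate_rect(img, 3, 3)  # ordinary 3x3 dilation, for comparison
```

Here the tall kernel closes the vertical gap in the left stroke and leaves the background column between the two strokes empty, whereas the square kernel fills that column and merges the strokes. This only helps with gaps that run vertically; a diagonal gap like the one in 'V' would need a differently oriented kernel.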

Thank you for your help!

