Re: [tesseract-ocr] Train tesseract for 14-segment display

Pierre-Henri DAUVERGNE Mon, 06 Jul 2015 00:19:01 -0700

Hi, thank you for your answer :)

Each character is about 100x160 pixels. Is that too small? I'll try with 
bigger ones and post the results here.


On Saturday, July 4, 2015 at 04:10:18 UTC+2, Art Rhyno wrote:
>
>  Hi,
>
>  
>
> I wonder if it has something to do with the sizing of the characters in 
> the image that you are using for font training. I swapped out the character 
> without the linked segments for a character in a set I am using and it 
> seemed to work ok. The set is too big for the list but I have attached the 
> image I used. 
>
>  
>
> art
>
>  
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* 
> Pierre-Henri DAUVERGNE
> *Sent:* Friday, July 03, 2015 10:20 AM
> *To:* [email protected]
> *Subject:* [tesseract-ocr] Train tesseract for 14-segment display
>
>  
>  
> Hello everyone.
>
> I've already posted on Stack Overflow but haven't had an answer yet (
> http://stackoverflow.com/questions/31131796/14-segment-display-and-tesseract-ocr-with-opencv
> ).
>
> I'm looking for a way to accurately OCR a 14-segment display. As you can see 
> in my SO thread, I trained tesseract with dilated characters, so that all 
> the segments of each character are linked together. My issue is that when I 
> read a character from my webcam, I have to erode it first to remove noise. 
> After that, I dilate it.
> However, I can't dilate it enough to link all the segments together without 
> having issues with letters like 'B' and 'D', and the letter 'V' is not 
> recognized at all (I believe it is because the gap between the diagonal 
> segments is too large).
>
> ·        What I trained tesseract with (that's the letter "V"): 
> http://i.imgur.com/NbmVqkb.png (segments are all linked)
>
> ·        What I feed tesseract with: http://i.imgur.com/0E4iXXk.png 
> (some segments are linked, some aren't)
>
> I tried to train tesseract with characters whose segments aren't all 
> linked, but it says "Empty page !!". When I manually link the segments, the 
> training works fine (it feels weird that tesseract can't be trained with 
> blank space inside characters, since some of the existing languages (e.g. 
> Arabic or Chinese) already have some).
>
> To bypass this issue, I've been trying different kinds of image processing 
> algorithms (like thinning, in order to dilate in height but not in 
> width), but none gave more accurate results.
>
> Thank you for your help!
>  
>  
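
For reference, the "dilate in height but not in width" idea from the quoted message can be sketched as below. This is only a minimal illustration under stated assumptions, not the code used in the thread: the `dilate` helper and the toy `glyph` are hypothetical, and the image is a plain list of rows rather than a webcam frame. With OpenCV, the same effect would normally come from `cv2.dilate` with a kernel built by `cv2.getStructuringElement(cv2.MORPH_RECT, (1, 3))`, where the size is given as (width, height).

```python
# Pure-Python sketch of binary dilation with a rectangular structuring element.
# Images are lists of rows; 1 = lit segment pixel, 0 = background.
# `dilate` and `glyph` are hypothetical names used only for this illustration.

def dilate(img, k_width, k_height):
    """Dilate a binary image with a k_width x k_height rectangle (odd sizes)."""
    rows, cols = len(img), len(img[0])
    ry, rx = k_height // 2, k_width // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if img[y][x]:
                # Spread each foreground pixel over the kernel footprint,
                # clipped to the image borders.
                for yy in range(max(0, y - ry), min(rows, y + ry + 1)):
                    for xx in range(max(0, x - rx), min(cols, x + rx + 1)):
                        out[yy][xx] = 1
    return out


# Toy glyph: two vertical strokes (columns 1 and 3), each broken by a
# one-pixel gap in row 2, like unlinked display segments.
glyph = [
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
]

tall = dilate(glyph, k_width=1, k_height=3)    # grows in height only
square = dilate(glyph, k_width=3, k_height=3)  # grows in height and width

for row in tall:
    print("".join("#" if v else "." for v in row))
```

In this toy case the 1x3 kernel closes the vertical gaps while keeping the two strokes separate, whereas the 3x3 kernel also fills the space between them, which is the kind of merging that causes trouble with letters like 'B' and 'D'. Whether a tall kernel is enough for the diagonal segments of a real 'V' would have to be checked on actual captures.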

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/71520219-dea6-4b99-ba76-bee71f7b11b6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
