Hi, thank you for your answer :) Each character is about 100x160 pixels; is that too small? I'll try with bigger ones and post the results here.
On Saturday, July 4, 2015 at 04:10:18 UTC+2, Art Rhyno wrote:
>
> Hi,
>
> I wonder if it has something to do with the sizing of the characters in
> the image that you are using for font training. I swapped out the character
> without the linked segments for a character in a set I am using and it
> seemed to work ok. The set is too big for the list but I have attached the
> image I used.
>
> art
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Pierre-Henri DAUVERGNE
> Sent: Friday, July 03, 2015 10:20 AM
> To: [email protected]
> Subject: [tesseract-ocr] Train tesseract for 14-segment display
>
> Hello everyone.
>
> I've posted on Stack Overflow already but haven't had an answer yet
> (http://stackoverflow.com/questions/31131796/14-segment-display-and-tesseract-ocr-with-opencv).
>
> I'm looking for a way to accurately OCR a 14-segment display. As you can see
> in my SO thread, I trained tesseract with dilated characters whose segments
> are all linked together. My issue is that when I read a character from my
> webcam, I have to erode it first to remove noise. After that, I dilate it.
> However, I can't dilate it enough to link all the segments together without
> having issues with letters like 'B' and 'D', and the letter 'V' is not
> recognized at all (I believe it is because the space between the
> diagonals is too long).
>
> · What I trained tesseract with (that's the letter "V"):
>   http://i.imgur.com/NbmVqkb.png (segments are all linked)
>
> · What I feed tesseract with: http://i.imgur.com/0E4iXXk.png
>   (some segments are linked, some aren't)
>
> I tried to train tesseract with characters whose segments aren't all
> linked, but it says "Empty page !!". When I manually link the segments, the
> training works fine (it feels weird that tesseract can't be trained with
> blank space inside characters, since some of the existing languages (e.g.
> Arabic or Chinese) already have some).
>
> To bypass this issue, I've been trying different kinds of image processing
> algorithms (like thinning, in order to dilate "in height" but not in
> "width") but none gave more accurate results.
>
> Thank you for your help !
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/451dbd65-20b7-437a-8b5b-a0a726bdad06%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
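
For what it's worth, the 'dilate "in height" but not in "width"' idea from the quoted message can be written as a morphological closing (dilate, then erode) with a tall, narrow kernel. Below is a minimal, dependency-free Python sketch of that idea, not code from this thread: the helper names (`_morph`, `dilate`, `erode`, `close_gaps`) and the tiny test glyph are made up for illustration, and the 3x1 kernel size is an assumption that would need tuning on real webcam images. With OpenCV, the same operation should be available through `cv2.morphologyEx` with `cv2.MORPH_CLOSE` and a kernel from `cv2.getStructuringElement(cv2.MORPH_RECT, (1, 3))`.

```python
def _morph(img, kh, kw, reduce):
    """Apply `reduce` (any or all) over a kh x kw window around each pixel.

    `img` is a list of rows of 0/1 values; kh and kw are odd kernel sizes.
    Pixels outside the image are simply ignored.
    """
    h, w = len(img), len(img[0])
    rh, rw = kh // 2, kw // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = (
                img[yy][xx]
                for yy in range(max(0, y - rh), min(h, y + rh + 1))
                for xx in range(max(0, x - rw), min(w, x + rw + 1))
            )
            out[y][x] = int(reduce(window))
    return out


def dilate(img, kh, kw):
    """A pixel becomes 1 if any pixel under the kernel is 1."""
    return _morph(img, kh, kw, any)


def erode(img, kh, kw):
    """A pixel stays 1 only if every pixel under the kernel is 1."""
    return _morph(img, kh, kw, all)


def close_gaps(img, kh, kw):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(img, kh, kw), kh, kw)


# Hypothetical glyph: column 1 is a vertical stroke broken by a 2-pixel gap,
# column 3 is an unbroken stroke, column 2 is the space between them.
glyph = [
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
]

# Kernel 3 high and 1 wide: link segments vertically, never horizontally.
closed = close_gaps(glyph, 3, 1)
for row in closed:
    print("".join("#" if v else "." for v in row))
```

In this toy case the broken stroke in column 1 is joined while the one-pixel space between the two strokes stays open, which is the behaviour wanted for letters like 'B' and 'D'. Whether a purely vertical kernel is enough for the diagonal gap in 'V' is untested here.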

