Hi Sabrina,

At this link <https://github.com/arturaugusto/display_ocr/tree/master/training_source> I included a Python script that helped me train Tesseract, together with the .box file and the .tif image that contains the font samples.
I can't remember the details, since I did this work a year ago and have never trained any other font. Do you already have sample images of the font you want to train?

2015-05-28 10:16 GMT-03:00 sabrina soraya <[email protected]>:

> Hi Arthur, first of all I want to say thanks for sharing the trained data
> files to us. But I found that the "7" digit data I have is different like
> your trained data. My "7" digit has one more segment in the left top. So I
> was thinking to train by myself. But I got problem when I follow training
> instruction from here
> https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
> especially in "Run Tesseract for Training" part. It did not give me *.tr
> file. Can you give me a clear instruction how did you train your data?
> Thank you in advance!
>
> Sabrina.
>
> On Friday, 4 July 2014 08:13:12 UTC+8, Artur wrote:
>>
>> Hi Nick,
>>
>> I've just pushed the training data to my project page!
>>
>> https://github.com/arturaugusto/display_ocr/tree/master/training_source
>>
>> If someone come with improvements as you told, I will be accepting pull
>> requests.
>>
>> Artur
>>
>>
>> 2014-07-03 18:47 GMT-03:00 Artur Augusto <[email protected]>:
>>
>>> And thats why I created a project that uses OpenCV so user can real time
>>> control the erosion..
>>>
>>>
>>> 2014-07-03 18:45 GMT-03:00 Artur Augusto <[email protected]>:
>>>
>>>> Sure, just need some time to compile all stuff in a more organized way
>>>> and document it.
>>>>
>>>> I needed to apply some erosion to preprocess the font because of the
>>>> problem to recognize segmented fonts.
>>>>
>>>> My trained data only works with erosion.
>>>>
>>>> I will do that as soon as I can.
>>>>
>>>> 2014-07-03 18:27 GMT-03:00 Nick White <[email protected]>:
>>>>
>>>>> Hi Artur,
>>>>>
>>>>> On Wed, Jul 02, 2014 at 10:18:55PM -0300, Artur Augusto wrote:
>>>>> > As many people ask about how to use tesseract to read 7 segments display, I
>>>>> > decided to publish an open source sample project.
>>>>> >
>>>>> > If someone wanna check it: https://github.com/arturaugusto/display_ocr
>>>>>
>>>>> Awesome, thanks so much for sharing! I was about to add it to the
>>>>> 3rdParty wiki page, but Zdenko beat me to it :)
>>>>>
>>>>> Can you share the source files for your training somewhere too
>>>>> (image and box files), so people can potentially improve on / add to
>>>>> the training themselves?
>>>>>
>>>>> Nick
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/20140703212755.GB19831%40manta.lan
>>>>> .
>>>>> For more options, visit https://groups.google.com/d/optout.
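
For anyone reading this thread later, here is a rough sketch of what the erosion step mentioned in the thread does to a segmented font. It is not the code from display_ocr. It is a plain-Python illustration that assumes dark segments on a light background, so that taking the minimum over each pixel's neighbourhood thickens the dark strokes and closes the small gaps between segments. The names `erode` and `strip` are made up for this example. With OpenCV, `cv2.erode` performs this kind of operation on real images.

```python
def erode(image, radius=1):
    """Grayscale erosion: each pixel becomes the minimum value found in the
    square neighbourhood of the given radius around it (clipped at borders)."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        row = []
        for x in range(width):
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            row.append(min(image[j][i]
                           for j in range(y0, y1)
                           for i in range(x0, x1)))
        out.append(row)
    return out


# Assumed toy input: two dark segments (0) on a light background (255),
# separated by a one-pixel light gap at column 4 of the middle row.
strip = [
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
    [255, 255,   0,   0, 255,   0,   0, 255, 255],
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
]

eroded = erode(strip, radius=1)

# Print the result with '#' for dark pixels and '.' for light ones.
for row in eroded:
    print("".join("#" if value == 0 else "." for value in row))
```

After erosion the gap in the middle row is filled, so the two segments form one connected stroke. On a display with bright segments on a dark background the same operation would shrink the segments instead, so the image would need to be inverted first (or dilated rather than eroded).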

