Out of roughly 2000 characters, about a quarter fail this way. I've tried char_spacing values from 1 all the way up to 10, and it doesn't seem to make any real difference. When we print the same text with 1.25 spacing and make the TIFF manually, we only get 5-20 failures. Is there something I could be missing here? Is training data required for this step? This is the command I'm running:
training/text2image --text=trainingText.txt --outputbase=eng.courier.exp0 --font='Courier New' --fonts_dir=/Library/Fonts/ --ptsize=14 --char_spacing=2.5 --degrade_image=0

Tom

On Monday, August 25, 2014 8:00:30 AM UTC-5, Nick White wrote:
> On Fri, Aug 22, 2014 at 12:42:21PM -0700, Thomas Bruno wrote:
> > Is this common when training from text2image output?
> >
> > APPLY_BOXES: boxfile line 5364/748 ((1488,893),(1532,6)): FAILURE! Couldn't find a matching blob
> >
> > FAIL!
>
> Yes, there will be some of these. Check the proportion of failing to
> not failing blobs is acceptable, and if not check out the
> char_spacing argument for text2image.
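
For comparing runs with different char_spacing values, it may help to compute the failure proportion directly instead of eyeballing the log. The following is only a rough sketch: it assumes each failure appears on a single log line containing both "APPLY_BOXES" and "FAILURE!" (as in the message quoted above), and that the .box file has one box per non-empty line. The function names are made up for illustration and are not part of Tesseract.

```python
def count_failures(log_text: str) -> int:
    """Count log lines that report a box with no matching blob."""
    return sum(
        1
        for line in log_text.splitlines()
        if "APPLY_BOXES" in line and "FAILURE!" in line
    )


def count_boxes(box_text: str) -> int:
    """Count boxes, assuming one box per non-empty line of the .box file."""
    return sum(1 for line in box_text.splitlines() if line.strip())


def failure_ratio(log_text: str, box_text: str) -> float:
    """Return failed boxes divided by total boxes (0.0 if there are no boxes)."""
    total = count_boxes(box_text)
    if total == 0:
        return 0.0
    return count_failures(log_text) / total
```

To use it, capture the training output to a file, read that file and the generated .box file as text, and pass both strings to failure_ratio. A value around 0.25 would correspond to the "about a quarter" case described above.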

