2011-02-10 14:34:32 EST
The log file below is the result of training with an image containing "slashed" zeros (a zero with a diagonal line through it, to differentiate it from an uppercase O).
If I edit out the diagonal, there are no errors in tesseract.log, but recognition of zero versus O is unreliable, even with a line for the pair in eng.unicharambigs. How can I get tesseract to accept the slashed zero? So far I have converted the image to black text on a white background and scaled it up to approximately 300 dpi.

----------------- tesseract.log -------------------------------------
Found fonts: ['IA']
Tesseract Open Source OCR Engine with Leptonica
APPLY_BOXES: boxfile 1/51/0 ((2295,326),(2323,370)): FAILURE! box overlaps no blobs or blobs in multiple rows
APPLY_BOXES: boxfile 3/51/0 ((2289,137),(2317,181)): FAILURE! box overlaps no blobs or blobs in multiple rows
APPLY_BOXES: More than one block??
APPLY_BOXES: FATALITY - 0 labelled samples of "0 [30 ]" - target is 2:
APPLY_BOXES:
    Boxes read from boxfile: 226
    Initially labelled blobs: 224 in 4 rows
    Box failures detected: 2
    Duped blobs for rebalance: 0
    "0" has fewest samples: 0
    Total unlabelled words: 0
    Final labelled words: 224
Generating training data
TRAINING ... Font name = IA
Generated training data for 224 blobs

See the TIF image at: http://www.flickr.com/photos/59351419@N05/5434403800/
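
To see exactly which boxes were rejected, the FAILURE lines in the log can be pulled out with a short script. This is only a sketch in Python, not part of Tesseract: the function name `failed_boxes` is made up for illustration, and it assumes the three slash-separated fields after `boxfile` are the box-file line, a character index, and the character label, which I have not confirmed against the Tesseract source. The coordinate names `x0, y0, x1, y1` are likewise just labels for the two corner pairs as printed.

```python
import re

# Matches the failure lines exactly as they appear in the log above.
# Assumption: the fields after "boxfile" are line / character index / label.
FAILURE_RE = re.compile(
    r"APPLY_BOXES: boxfile (\d+)/(\d+)/(\S+) "
    r"\(\((\d+),(\d+)\),\((\d+),(\d+)\)\): FAILURE!"
)


def failed_boxes(log_text):
    """Return one dict per box reported as FAILURE in the log text."""
    failures = []
    for m in FAILURE_RE.finditer(log_text):
        x0, y0, x1, y1 = (int(m.group(i)) for i in range(4, 8))
        failures.append({
            "line": int(m.group(1)),
            "index": int(m.group(2)),
            "label": m.group(3),
            "x0": x0, "y0": y0, "x1": x1, "y1": y1,
            # Box size in pixels, handy for comparing against glyph size.
            "width": x1 - x0,
            "height": y1 - y0,
        })
    return failures


# The two failure lines copied from the log in this post.
sample_log = """\
APPLY_BOXES: boxfile 1/51/0 ((2295,326),(2323,370)): FAILURE! box overlaps no blobs or blobs in multiple rows
APPLY_BOXES: boxfile 3/51/0 ((2289,137),(2317,181)): FAILURE! box overlaps no blobs or blobs in multiple rows
"""

for box in failed_boxes(sample_log):
    print(box)
```

Under that reading of the fields, both rejected boxes carry the label "0" and are 28 by 44 pixels, which matches the log's report that "0" ends up with no labelled samples.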

