Try setting edges_use_new_outline_complexity to True (see box.train2 in the attached test case).
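
In case it helps, here is a minimal sketch of making that change to a copy of a training config. It assumes the usual Tesseract config layout of one "name value" pair per line, with T for true. The helper name set_config_var and the sample base line are made up for illustration; the real contents of box.train2 are in the attachment and are not reproduced here.

```python
def set_config_var(config_text: str, name: str, value: str) -> str:
    """Return config_text with `name` set to `value`.

    Assumes one whitespace-separated "name value" pair per line.
    An existing line for `name` is replaced; otherwise a new line is appended.
    """
    out = []
    found = False
    for line in config_text.splitlines():
        parts = line.split(None, 1)
        if parts and parts[0] == name:
            out.append(f"{name} {value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{name} {value}")
    return "\n".join(out) + "\n"


# Hypothetical base config line, for illustration only.
base = "tessedit_train_from_boxes T\n"
patched = set_config_var(base, "edges_use_new_outline_complexity", "T")
print(patched, end="")
```

Save the result under a new name in tessdata/configs and pass that name in place of box.train on the training command line.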

Zdenko

On Thu, Feb 10, 2011 at 8:48 PM, [email protected] <[email protected]> wrote:

> 2011-02-10 14:34:32 EST
> The log file below is the result of training with an image containing
> "slashed" zeros (zero with a diagonal line in it to differentiate it
> from Upper-case O.)
>
> If I edit out the diagonal, there are no errors in tesseract.log, but
> interpretation of zero and O are unreliable, even with a line in
> eng.unicharambigs.
>
> How can I get tesseract to accept the slashed zero? So far I have
> converted the image to black text on white background and scaled up to
> approx. 300 dpi.
>
> ----------------- tesseract.log -------------------------------------
> Found fonts: ['IA']
> Tesseract Open Source OCR Engine with Leptonica
> APPLY_BOXES: boxfile 1/51/0 ((2295,326),(2323,370)): FAILURE! box overlaps no blobs or blobs in multiple rows
> APPLY_BOXES: boxfile 3/51/0 ((2289,137),(2317,181)): FAILURE! box overlaps no blobs or blobs in multiple rows
> APPLY_BOXES: More than one block??
> APPLY_BOXES: FATALITY - 0 labelled samples of "0 [30 ]" - target is 2:
> APPLY_BOXES: Boxes read from boxfile: 226
> Initially labelled blobs: 224 in 4 rows
> Box failures detected: 2
> Duped blobs for rebalance: 0
> "0" has fewest samples: 0
> Total unlabelled words: 0
> Final labelled words: 224
> Generating training data
> TRAINING ... Font name = IA
> Generated training data for 224 blobs
>
> See tif image at: http://www.flickr.com/photos/59351419@N05/5434403800/
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.

test_case.tar.gz
Description: GNU Zip compressed data

