Please delete the following boxes from the box file and try again using the same command line. You will have fewer samples this time, but it may succeed. The tif file is OK now.
APPLY_BOXES: boxfile 1/2/C ((919,900),(1047,1120)): FAILURE! box overlaps blob in labelled word
APPLY_BOXES: ALSO ignoring corrupted char blk:1 row:1 "C"
APPLY_BOXES: boxfile 1/4/5 ((1276,907),(1401,1122)): FAILURE! box overlaps blob in labelled word
APPLY_BOXES: ALSO ignoring corrupted char blk:1 row:1 "1"
APPLY_BOXES: boxfile 1/6/B ((1539,904),(1666,1122)): FAILURE! box overlaps blob in labelled word

On Wed, May 12, 2010 at 3:55 PM, faster589 <[email protected]> wrote:
> OK! I have resampled the image at 300x300 and binarized it, but when
> training Tesseract I obtain this log:
>
> Tesseract Open Source OCR Engine
> Image has 1 * 1 bit per pixel, and size (2667,2000)
> Resolution=300
> APPLY_BOXES: boxfile 1/2/C ((919,900),(1047,1120)): FAILURE! box overlaps blob in labelled word
> APPLY_BOXES: ALSO ignoring corrupted char blk:1 row:1 "C"
> APPLY_BOXES: boxfile 1/4/5 ((1276,907),(1401,1122)): FAILURE! box overlaps blob in labelled word
> APPLY_BOXES: ALSO ignoring corrupted char blk:1 row:1 "1"
> APPLY_BOXES: boxfile 1/6/B ((1539,904),(1666,1122)): FAILURE! box overlaps blob in labelled word
> APPLY_BOXES: ALSO ignoring corrupted char blk:1 row:1 "4"
> APPLY_BOXES: FATALITY - 0 labelled samples of "C" - target is 2:
> C:[43]
> APPLY_BOXES: FATALITY - 0 labelled samples of "1" - target is 1:
> 1:[31]
> APPLY_BOXES: FATALITY - 0 labelled samples of "5" - target is 1:
> 5:[35]
> APPLY_BOXES: FATALITY - 0 labelled samples of "4" - target is 1:
> 4:[34]
> APPLY_BOXES: FATALITY - 0 labelled samples of "B" - target is 1:
> B:[42]
> APPLY_BOXES:
> Boxes read from boxfile: 7
> Initially labelled blobs: 1 in 1 rows
> Box failures detected: 6
> Duped blobs for rebalance: 0
> "C" has fewest samples: 0
> Total unlabelled words: 0
> Final labelled words: 1
> Generating training data
>
> Box overlaps? Why? I used a Python script to find the boxes! What's the
> problem now?
> These are the two files:
>
> http://groups.google.com/group/tesseract-ocr/web/t1%20%282%29.tif
> http://groups.google.com/group/tesseract-ocr/web/t1%20%282%29.box
>
> --
> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to [email protected].
> For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.
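
If deleting the failing boxes by hand gets tedious, something like the following might help. It is only a rough sketch, not part of Tesseract: the function names `failed_boxes` and `drop_failed` are made up for illustration, and it assumes that each box file line starts with `char left bottom right top` and that the coordinates printed in the `APPLY_BOXES ... FAILURE!` lines are the same numbers that appear in the box file. Please check both assumptions against your own files before relying on it.

```python
import re

# Matches log lines such as:
# APPLY_BOXES: boxfile 1/2/C ((919,900),(1047,1120)): FAILURE! box overlaps ...
FAILURE_RE = re.compile(
    r"APPLY_BOXES: boxfile \d+/\d+/(\S+) "
    r"\(\((\d+),(\d+)\),\((\d+),(\d+)\)\): FAILURE!"
)


def failed_boxes(log_text):
    """Collect (char, n1, n2, n3, n4) for every box reported as a FAILURE."""
    return {
        (m.group(1),) + tuple(int(v) for v in m.groups()[1:])
        for m in FAILURE_RE.finditer(log_text)
    }


def drop_failed(box_lines, failures):
    """Return the box lines that do not match any reported failure.

    Assumption: a box line looks like 'char left bottom right top' and the
    log prints the same four numbers in the same order.
    """
    kept = []
    for line in box_lines:
        parts = line.split()
        key = None
        if len(parts) >= 5:
            try:
                key = (parts[0],) + tuple(int(p) for p in parts[1:5])
            except ValueError:
                key = None  # not a well-formed box line, keep it untouched
        if key in failures:
            continue  # skip boxes that the log reported as failing
        kept.append(line)
    return kept
```

To use it, read the training log and the box file into memory, call `failed_boxes` on the log text, pass the result to `drop_failed` together with the box file lines, write the kept lines back out, and rerun the same training command.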

