Try setting edges_use_new_outline_complexity to True (see box.train2 in
the attached test case).
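
For readers without the attachment, here is a minimal sketch of what such
a config could look like. It assumes box.train2 is just a copy of the stock
box.train config with one extra line; the real file in the test case may
differ.

```
# box.train2 (hypothetical contents): keep the lines from box.train, then add
edges_use_new_outline_complexity T
```

The config name would then be given on the tesseract command line in place
of box.train for the training run.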

Zdenko


On Thu, Feb 10, 2011 at 8:48 PM, [email protected] <[email protected]> wrote:

> 2011-02-10 14:34:32 EST
> The log file below is the result of training with an image containing
> "slashed" zeros (a zero with a diagonal line through it, to distinguish
> it from upper-case O).
>
> If I edit out the diagonal, there are no errors in tesseract.log, but
> recognition of zero and O is unreliable, even with a line in
> eng.unicharambigs.
>
> How can I get tesseract to accept the slashed zero? So far I have
> converted the image to black text on white background and scaled up to
> approx. 300 dpi.
>
> ----------------- tesseract.log -----------------
> Found fonts: ['IA']
> Tesseract Open Source OCR Engine with Leptonica
> APPLY_BOXES: boxfile 1/51/0 ((2295,326),(2323,370)): FAILURE! box overlaps no blobs or blobs in multiple rows
> APPLY_BOXES: boxfile 3/51/0 ((2289,137),(2317,181)): FAILURE! box overlaps no blobs or blobs in multiple rows
> APPLY_BOXES: More than one block??
> APPLY_BOXES: FATALITY - 0 labelled samples of "0 [30 ]" - target is 2:
> APPLY_BOXES: Boxes read from boxfile: 226
> Initially labelled blobs: 224 in 4 rows
> Box failures detected: 2
> Duped blobs for rebalance: 0
> "0" has fewest samples: 0
> Total unlabelled words: 0
> Final labelled words: 224
> Generating training data TRAINING ... Font name = IA
> Generated training data for 224 blobs
>
>
> See tif image at: http://www.flickr.com/photos/59351419@N05/5434403800/
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
>
>


Attachment: test_case.tar.gz
Description: GNU Zip compressed data
