For the best training results, use a white background and black text.
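
If part of the page is light text on a dark background, it can be converted before training. The following is only a rough sketch, not anything Tesseract itself provides: it assumes the image has already been loaded as rows of 8-bit grayscale values (0 = black, 255 = white), and the function name to_black_on_white is made up for illustration. It guesses the background from whichever side of the midpoint holds more pixels, which is an assumption that only holds when text covers less than half of the region.

```python
def to_black_on_white(pixels):
    """Return a binarized copy with black (0) text on a white (255) background.

    `pixels` is a list of rows of 8-bit grayscale values (hypothetical input
    format, chosen so the sketch needs no imaging library).
    """
    flat = [v for row in pixels for v in row]
    if not flat:
        return []
    # Split the value range at its midpoint (a crude threshold, assumed here).
    mid = (min(flat) + max(flat)) / 2
    dark_count = sum(1 for v in flat if v < mid)
    # Assumption: the background covers more than half of the region.
    dark_background = dark_count > len(flat) / 2
    result = []
    for row in pixels:
        new_row = []
        for v in row:
            is_dark = v < mid
            # Text is whatever is NOT on the background side of the midpoint.
            is_text = (not is_dark) if dark_background else is_dark
            new_row.append(0 if is_text else 255)
        result.append(new_row)
    return result


# Gray text (100) on a black background (0) becomes black text on white.
sample = [[0, 0, 0, 0],
          [0, 100, 0, 0],
          [0, 0, 0, 0]]
converted = to_black_on_white(sample)
```

A page that mixes both styles would need this applied region by region rather than to the whole image.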

On 7 June 2012 06:11, cchhsu <[email protected]> wrote:

> I'm new to tesseract. I'm trying to train a font using the following
> instructions.
> (http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3)
>
> I made the box file successfully:
> - tesseract moe.calibri.exp0.tif moe.calibri.exp0 -l eng batch.nochop makebox
> - Used the external tool jTessBoxEditor to adjust the box coordinates.
>
> Then I tried to generate a box.train file with this command:
> C:\Program Files\Tesseract-OCR\001>tesseract moe.calibri.exp0.tif moe.calibri.exp0 nobatch box.train
>
> I get the unexpected output:
> ---
> Tesseract Open Source OCR Engine v3.01 with Leptonica
> Page 0
> APPLY_BOXES: boxfile line 3/r ((28,201),(39,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 4/e ((44,201),(59,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 5/m ((65,201),(93,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 6/i ((96,202),(103,231)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 7/u ((107,201),(125,224)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 8/m ((129,201),(158,224)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 9/G ((318,201),(337,231)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 10/e ((343,202),(357,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 11/n ((365,202),(379,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 12/r ((386,202),(396,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 13/e ((401,202),(415,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 14/S ((574,201),(589,231)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 15/e ((592,201),(607,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 16/n ((614,201),(629,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 17/s ((636,201),(647,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 18/M ((651,202),(680,231)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 19/e ((684,201),(700,223)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 20/T ((704,215),(713,227)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 21/M ((716,215),(728,227)): FAILURE! Couldn't find a matching blob
> Box file format error on line 1; ignored
> APPLY_BOXES:
>   Boxes read from boxfile:      33
>   Boxes failed resegmentation:      19
>   Found 14 good blobs and 0 unlabelled blobs in 0 words.
>   0 remaining unlabelled words deleted.
> TRAINING ... Font name = calibri
> LearnBLob: CharDesc was NULL. Aborting.
> Generated training data for 3 words
>
> C:\Program Files\Tesseract-OCR\001>
>
>
> After getting this unexpected output, I tried two things to solve it.
> 1. Changed the hue/contrast of the image file and then ran the
> box.train command again.
>    I got fewer errors.
> 2. Restored the box file to its initial state (that is, the box file
> as generated, without changing any values).
>    Then I could generate the box.train file successfully.
>    - I used jTessBoxEditor to open the box file to verify.
>      I found that black characters on a white background can be
> identified, but gray characters on a black background are hard to
> identify.
>
> I have uploaded the reference files to my SkyDrive space.
> You can download them if you are interested. (http://sdrv.ms/MIOzZF)
>
> But I don't know how to solve this issue. Please help me to solve
> it.
> Thanks.
>
>
> tesseract 3.01
> OS : Windows XP
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
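
The "Box file format error on line 1" message in the quoted log suggests it is worth checking the box file itself after editing. The sketch below is only an illustration under assumptions: it takes each line to be a symbol followed by five integers (left, bottom, right, top, page), which is the layout I understand Tesseract 3.0x box files to use, and the function name check_box_lines and the sample lines are made up rather than taken from the poster's file.

```python
def check_box_lines(lines):
    """Return a list of (line_number, reason) for lines that look malformed.

    Assumed layout per line: <symbol> <left> <bottom> <right> <top> <page>.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 6:
            problems.append((number, "expected 6 fields, got %d" % len(fields)))
            continue
        try:
            left, bottom, right, top, page = (int(f) for f in fields[1:])
        except ValueError:
            problems.append((number, "non-integer coordinate"))
            continue
        if left >= right or bottom >= top:
            problems.append((number, "empty or inverted box"))
    return problems


# Illustrative lines only; not copied from the real box file.
sample_lines = [
    "r 28 201 39 223 0",
    "e 44 201 59",
    "m 65 x 93 223 0",
    "i 103 202 96 231 0",
]
report = check_box_lines(sample_lines)
```

Lines that pass this check can still fail to match a blob, as in the log above, so this only rules out formatting mistakes.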
