Dear Tesseract users/developers,

 

I am having a problem using Tesseract to train Chinese OCR. Two examples 
follow:

 

1. Tesseract reports an empty page, even though the TIF is not empty and the 
box file bounds the character tightly:

.\tesseract test.ming.24.tif test.ming.24 batch.nochop box.train

 

=== begin output ===
Tesseract Open Source OCR Engine v3.02 with Leptonica
Empty page!!
Empty page!!
=== end output ===

 

 

2. Resegmentation fails, even though -psm 10 specifically tells Tesseract 
that there is only one character:

.\tesseract test.ming.24.tif test.ming.24 -psm 10 batch.nochop box.train

 

=== begin output ===
Tesseract Open Source OCR Engine v3.02 with Leptonica
Bounding box=(16,23)->(28,32)
Bounding box=(16,15)->(28,24)
APPLY_BOXES: boxfile line 0/??((8,14),(36,41)): FAILURE! Couldn't find a matching blob
APPLY_BOXES:
  Boxes read from boxfile:       1
  Boxes failed resegmentation:       1
APPLY_BOXES: Unlabelled word at :Bounding box=(16,15)->(28,32)
APPLY_BOXES: Unlabelled word at :Bounding box=(8,14)->(36,41)
   Found 0 good blobs.
   2 remaining unlabelled words deleted.
Generated training data for 0 words
=== end output ===
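
As a sanity check on the coordinates in this output, the short Python sketch 
below compares the rectangles by hand. It is not part of Tesseract, and the 
helper names (contains, union) are hypothetical. It assumes every rectangle in 
the log is printed as (x_min, y_min)->(x_max, y_max) in one shared coordinate 
system. Under that assumption, both blob boxes lie inside the box-file 
rectangle, and their union equals the first "Unlabelled word" box, so the 
failure does not appear to come from the box file missing the character.

```python
from typing import Tuple

# A rectangle as (x_min, y_min, x_max, y_max), copied from the log above.
# Assumption: all rectangles share the same coordinate system.
Box = Tuple[int, int, int, int]


def contains(outer: Box, inner: Box) -> bool:
    """Return True if `inner` lies entirely within `outer` (hypothetical helper)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def union(a: Box, b: Box) -> Box:
    """Return the smallest rectangle covering both `a` and `b` (hypothetical helper)."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


# Values taken verbatim from the output above.
boxfile_box: Box = (8, 14, 36, 41)
blobs = [(16, 23, 28, 32), (16, 15, 28, 24)]

for blob in blobs:
    print(blob, "inside box-file rectangle:", contains(boxfile_box, blob))

print("union of blobs:", union(blobs[0], blobs[1]))
```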

 

Can anyone help?

 

 

Regards,

W. K. Lo

 

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
