Hi
I am trying to train Tesseract on digit images extracted from
video. Training works whenever the box files are generated
correctly, but sometimes no box file is generated even for very
clear images. I am uploading two images (one is the digit 1, the
other the digit 0): I get a clean box markup for 1 but nothing for 0,
and I cannot see what differs between the two images. I also tried
creating my own box files for the images, but that does not work
either, as the output below shows. If anyone has any suggestions as
to why box recognition does not work for the "0" image, I would be
extremely grateful.
I am using Tesseract on a Windows XP machine via a Cygwin installation.
The uploaded files are named "t1score.tif" (the digit 1) and
"t2score.tif" (the digit 0).

$ /usr/local/bin/tesseract.exe t2score.tif junk nobatch box.train
Tesseract Open Source OCR Engine
Image has 8 bits per pixel and size (44,34)
Box file format error on line 2 ignored
APPLY_BOXES: FATALITY - 0 labelled samples of "0" - target is 1
APPLY_BOXES:
   Boxes read from boxfile:       1
   Initially labelled blobs:      0 in 0 rows
   Box failures detected:                    1
   Duped blobs for rebalance:     0
   "0" has fewest samples:     0
                                Total unlabelled words:        0
                                Final labelled words:          0
Generating training data
Generated training data for 0 blobs
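
For anyone who wants to sanity-check a hand-made box file before
running the command above, here is a minimal sketch. It assumes the
box file layout of one "<char> <left> <bottom> <right> <top>" entry
per line, with coordinates measured from the bottom-left corner of the
image. The function name check_box_lines and the sample coordinates
are made up for illustration; only the 44x34 image size comes from the
log. A blank or short line is just one thing the "Box file format
error" message might point at, not a confirmed cause.

```python
def check_box_lines(lines, width, height):
    """Return (line_number, problem) pairs for lines that look malformed.

    Assumes each line is "<char> <left> <bottom> <right> <top>".
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        fields = raw.split()
        if len(fields) != 5:
            problems.append((number, "expected 5 fields, found %d" % len(fields)))
            continue
        try:
            left, bottom, right, top = (int(f) for f in fields[1:])
        except ValueError:
            problems.append((number, "coordinates are not all integers"))
            continue
        # The box must have positive area and lie inside the image.
        if not (0 <= left < right <= width and 0 <= bottom < top <= height):
            problems.append((number, "box is empty or outside the image"))
    return problems


# Hypothetical box file contents: one entry plus a stray blank line.
sample = ["0 10 5 34 29", ""]
for line_number, problem in check_box_lines(sample, width=44, height=34):
    print("line %d: %s" % (line_number, problem))
```

To check a real file, pass open("t2score.box").read().split("\n") as
the first argument (the file name here is an assumption).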