The problem is probably that the textline finder is splitting your
characters over multiple lines. It is not supposed to do this, but it
sometimes does. applybox needs a fix so that it still works in this
situation.

Ray.
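
One rough way to check for this is to compare the number of text rows your
box file implies with the "in N rows" figure that APPLY_BOXES prints. The
sketch below is only an illustration and not part of Tesseract. It assumes
each box-file line has the form "<char> <left> <bottom> <right> <top>" with an
optional trailing page number. The helper names parse_box_line and
summarize_boxes, and the sample coordinates, are made up for the example.

```python
from collections import Counter


def parse_box_line(line):
    """Parse one box-file line: '<char> <left> <bottom> <right> <top> [page]'."""
    parts = line.split()
    if len(parts) < 5:
        raise ValueError(f"malformed box line: {line!r}")
    char = parts[0]
    left, bottom, right, top = (int(p) for p in parts[1:5])
    return char, left, bottom, right, top


def summarize_boxes(lines):
    """Return (per-character box counts, number of rows implied by the boxes)."""
    boxes = [parse_box_line(l) for l in lines if l.strip()]
    counts = Counter(b[0] for b in boxes)
    # Merge vertical extents [bottom, top] that overlap; each merged span
    # is treated as one text row.
    spans = sorted((b[2], b[4]) for b in boxes)
    rows = []
    for bottom, top in spans:
        if rows and bottom <= rows[-1][1]:
            rows[-1][1] = max(rows[-1][1], top)
        else:
            rows.append([bottom, top])
    return counts, len(rows)


# Hypothetical sample: three boxes on an upper row, one on a lower row.
sample = [
    "0 10 100 40 160",
    "1 50 100 60 160",
    ". 65 100 70 108",
    "7 10 20 40 80",
]
counts, n_rows = summarize_boxes(sample)
print(dict(counts), n_rows)
```

If the row count from the box file is much smaller than the number of rows
APPLY_BOXES reports, that is consistent with the textline finder splitting
the broken segments into separate lines.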

On Thu, May 14, 2009 at 11:26 PM, Raj <[email protected]> wrote:

>
> Hi friends,
>
> I'm working on Tesseract to recognize a 7-segment display in a C#
> application.
>
> I have successfully trained Tesseract and it is working perfectly
> with the digital meter images.
>
> But I have one problem.
>
> I have one more digital meter image where the digits are
> broken/segmented, and I want the application to recognize it as well.
> When I tried to train on samples of the digits .,0,1,2,3,4,5,6,7,8,9,
> I got only "." (dot) and "7" in the "TR" file, and for the other
> digits I got the messages below in the Tesseract text file. But
> I have checked the image and the box file, and the box for each
> digit is correct, i.e. the coordinates of each digit's box are
> right.
>
>
> You can have a look at the sample image used for training Tesseract at
> http://www.flickr.com/photos/30806...@n02/3532294181/sizes/l/
>
> Tesseract Open Source OCR Engine
> Image has 24 bits per pixel and size (966,520)
> Resolution=0
> APPLY_BOXES: FATALITY - 0 labelled samples of "0" - target is 10
> APPLY_BOXES: FATALITY - 0 labelled samples of "1" - target is 10
> APPLY_BOXES: FATALITY - 0 labelled samples of "2" - target is 10
> APPLY_BOXES: FATALITY - 0 labelled samples of "3" - target is 10
> APPLY_BOXES: FATALITY - 0 labelled samples of "4" - target is 10
> APPLY_BOXES: FATALITY - 0 labelled samples of "5" - target is 10
> APPLY_BOXES: FATALITY - 0 labelled samples of "6" - target is 10
> APPLY_BOXES: FATALITY - 0 labelled samples of "8" - target is 10
> APPLY_BOXES: FATALITY - 0 labelled samples of "9" - target is 10
> APPLY_BOXES:
>   Boxes read from boxfile:     108
>   Initially labelled blobs:     18 in 10 rows
>   Box failures detected:                   90
>   Duped blobs for rebalance:     0
>   "0" has fewest samples:     0
>                                Total unlabelled words:       27
>                                Final labelled words:         18
> Generating training data
> TRAINING ... Font name = UnknownFont.
> Generated training data for 18 blobs
>
>
>
>
> I have read the documentation, which says:
>
>           " If there are FATALITIES reported, then there is no point
> continuing with the training process until you fix the box file. A
> FATALITY usually indicates that this step failed to find any training
> samples of one of the characters listed in your box file. Either the
> coordinates are wrong, or there is something wrong with the image of
> the character concerned. If there is no workable sample of a
> character, it can't be recognized, and the generated inttemp file
> won't match the unicharset file later and Tesseract will abort. "
>
>
> I just wanted to know whether it is possible to train Tesseract for
> segmented/broken digits?
