Hi all,
I have generated a TIFF file and a box file, and I am trying to train
Tesseract with this line:

The output I get is:

Tesseract Open Source OCR Engine with Leptonica
Apply_Boxes:
        Boxes read from boxfile:  572
        Initially labeled blobs 772 in 8 rows
        Box failures detected: 0
        Duped blobs for rebalance: 0
        "9" has fewest samples: 21
                Total unlabeled words: 0
                Final labelled words: 572
Generating training data
TRAINING ... Font name = microsoftsansserif

After the last line I get an exception from tesseract.exe (shown in the
VS debugger):
Unhandled exception at 0x7c910a19 in tesseract.exe: 0xC0000005: Access
violation reading location 0x00000000.

The debugger points to this line in malloc.c:
            res = _heap_alloc(size);

Why does this happen? I can't find any useful information about it. Is
it related to the image I'm using for training?
I typed digits, '.' and ':' in WordPad, took a screenshot, saved it as a
TIFF file, and then enlarged it 3x with IrfanView.
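
In case the box file is part of the problem, here is a rough sketch of a
sanity check that could be run on it before training. It assumes the
usual box format of one glyph per line, "symbol left bottom right top",
optionally followed by a page number, with the origin at the bottom-left
of the image. The function names are made up for illustration and are
not part of Tesseract.

```python
from collections import Counter


def check_box_lines(lines):
    """Count samples per symbol and collect malformed box-file lines.

    Assumed line format: symbol left bottom right top [page]
    Returns (counts, problems), where problems is a list of
    (line_number, description) tuples.
    """
    counts = Counter()
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) not in (5, 6):
            problems.append((number, "expected 5 or 6 fields"))
            continue
        symbol = fields[0]
        try:
            left, bottom, right, top = (int(v) for v in fields[1:5])
        except ValueError:
            problems.append((number, "non-integer coordinate"))
            continue
        # A usable box has positive width and height and no negative corner.
        if left >= right or bottom >= top or min(left, bottom) < 0:
            problems.append((number, "empty or inverted box"))
            continue
        counts[symbol] += 1
    return counts, problems


def check_box_file(path):
    """Run check_box_lines on a box file stored at a (hypothetical) path."""
    with open(path, encoding="utf-8") as handle:
        return check_box_lines(handle)
```

The per-symbol counts can be compared with the "fewest samples" line in
the training output, and any reported problem lines point at boxes worth
inspecting by hand. This only checks the box file itself; it says
nothing about whether the TIFF is in a form Tesseract accepts.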

Please help.

P.S. How can I train Tesseract so that it can recognise small fonts?
When I try to train on images with small digits, I get errors related
to overlapping blobs ...


