Hi all,
I have generated a TIFF file and a box file, and I am trying to train
Tesseract with this line:
In the output I get:
Tesseract Open Source OCR Engine with Leptonica
Apply_Boxes:
Boxes read from boxfile: 572
Initially labeled blobs 772 in 8 rows
Box failures detected: 0
Duped blobs for rebalance: 0
"9" has fewest samples: 21
Total unlabeled words: 0
Final labelled words: 572
Generating training data
TRAINING ... Font name = microsoftsansserif
After the last line I get an exception from tesseract.exe (caught in
the VS debugger):
Unhandled exception at 0x7c910a19 in tesseract.exe: 0xC0000005: Access
violation reading location 0x00000000.
and the debugger points to this line:
res = _heap_alloc(size);
in the file malloc.c.
Why does this happen? I can't find any useful information about it. Is
it related to the image I'm using for training?
I typed the digits, "." and ":" in WordPad, took a screenshot, saved
it as a TIFF, and then resized it 3x with IrfanView.
Please help.
P.S. How can I train Tesseract to recognise small fonts? When I try
to train on images with small digits, I get errors related to
overlapping blobs ...
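
Since the output above reports a per-character sample count and the
small-font runs complain about overlapping blobs, one way to
sanity-check a box file before training is sketched here. This is only
a rough illustration, not part of Tesseract: it assumes each box-file
line has the form "symbol left bottom right top [page]", and the
function names (parse_box_line, boxes_overlap, check_boxes) are
hypothetical helpers made up for this example.

```python
from collections import Counter


def parse_box_line(line):
    """Parse one box-file line into (symbol, left, bottom, right, top).

    Assumption: fields are whitespace-separated, with the symbol first,
    then four integer coordinates, then an optional page number that is
    ignored here.
    """
    parts = line.split()
    if len(parts) < 5:
        raise ValueError(f"malformed box line: {line!r}")
    symbol = parts[0]
    left, bottom, right, top = (int(p) for p in parts[1:5])
    return symbol, left, bottom, right, top


def boxes_overlap(a, b):
    """Return True if two parsed boxes share some interior area."""
    _, l1, b1, r1, t1 = a
    _, l2, b2, r2, t2 = b
    return l1 < r2 and l2 < r1 and b1 < t2 and b2 < t1


def check_boxes(lines):
    """Return (samples per symbol, list of overlapping box pairs)."""
    boxes = [parse_box_line(ln) for ln in lines if ln.strip()]
    counts = Counter(box[0] for box in boxes)
    # Pairwise comparison is quadratic, which is fine for a few
    # hundred boxes such as the 572 reported above.
    overlaps = [
        (boxes[i], boxes[j])
        for i in range(len(boxes))
        for j in range(i + 1, len(boxes))
        if boxes_overlap(boxes[i], boxes[j])
    ]
    return counts, overlaps


if __name__ == "__main__":
    # Made-up sample data, not taken from the real box file.
    sample = [
        "1 10 10 20 30 0",
        "2 22 10 32 30 0",
        "2 30 10 40 30 0",
        ": 45 10 48 30 0",
    ]
    counts, overlaps = check_boxes(sample)
    print("samples per symbol:", counts.most_common())
    print("overlapping pairs:", len(overlaps))
```

To run it on a real box file, pass the file's lines to check_boxes,
for example check_boxes(open(path, encoding="utf-8")), where path is
wherever the box file lives. Overlapping pairs would point at boxes
worth re-drawing, or at glyphs that are too close together at the
current image scale.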