[tesseract-ocr] Assertion failure

George Papadopoulos Tue, 15 Nov 2016 02:45:06 -0800

Hello,

I ran into an assertion failure when running tesseract on a scanned image.
The output that I get is:

Page 1
Detected 146 diacritics
split_pt >0 && split_pt < word->chopped_word->NumBlobs():Error:Assert
failed:in file ..\..\ccmain\tfacepp.cpp, line 186

I am testing on Windows and the tesseract version is:

tesseract 3.04.02dev
leptonica-1.71 (Oct 21 2016, 18:04:17) [MSC v.1800 DLL Release x86]
libgif 4.1.6(?) : libjpeg 8c : libpng 1.4.3 : libtiff 3.9.4 : zlib 1.2.8

The image is a black & white TIFF file, compressed with CCITT Group 4 Fax
Encoding, at a resolution of 300 dpi and 2461 x 3478 pixels, with a file
size of 112K. Unfortunately I cannot attach the file.

The image is a form that includes some hand-written areas. If I redact the
image with black boxes on top of every hand-written area, then tesseract is
able to process the file without crashing. So my first thought was that the
problem is in the recognition of handwriting.

However, I also tried resizing the original (unredacted) image to 1920 x
2714 and it also worked. So it seems that the handwriting is no longer a
problem when the image is slightly smaller.
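
A minimal sketch of how the resize workaround might be automated, assuming
that the failed assertion shows up as a non-zero exit status from the
tesseract command. The names ocr_with_fallback, run_tesseract and the shrink
callable are hypothetical helpers for illustration, not part of tesseract,
and the 0.78 factor is just roughly 1920 / 2461 from the sizes above:

```python
import subprocess
from typing import Callable, Optional, Sequence


def run_tesseract(image_path: str, output_base: str = "out") -> bool:
    """Run the tesseract CLI on one image and report whether it succeeded.

    Uses the basic command form `tesseract <image> <outputbase>`.
    Assumption: a failed assertion ends the process with a non-zero status.
    """
    result = subprocess.run(
        ["tesseract", image_path, output_base],
        capture_output=True,
    )
    return result.returncode == 0


def ocr_with_fallback(
    image_path: str,
    run_ocr: Callable[[str], bool],
    shrink: Callable[[str, float], str],
    factors: Sequence[float] = (0.78,),  # roughly 1920 / 2461
) -> Optional[str]:
    """Try OCR on the original image, then on downscaled copies.

    `shrink` is a hypothetical helper that writes a resized copy of the
    image (with any imaging tool) and returns the path of that copy.
    Returns the path that was processed successfully, or None if every
    attempt failed.
    """
    if run_ocr(image_path):
        return image_path
    for factor in factors:
        smaller = shrink(image_path, factor)
        if run_ocr(smaller):
            return smaller
    return None
```

In use, run_tesseract would be passed as run_ocr, together with whatever
resize routine the system already has. This only works around the crash and
does not explain why the assertion fails at the original size.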

I am trying to use tesseract in an automated system that processes scanned
images. Any ideas on how to resolve this?

Thank you very much

George
