Hello,

I ran into an assertion failure when running tesseract on a scanned image.
The output that I get is:

Page 1
Detected 146 diacritics
split_pt >0 && split_pt < word->chopped_word->NumBlobs():Error:Assert 
failed:in file ..\..\ccmain\tfacepp.cpp, line 186

I am testing on Windows and the tesseract version is:

tesseract 3.04.02dev
 leptonica-1.71 (Oct 21 2016, 18:04:17) [MSC v.1800 DLL Release x86]
  libgif 4.1.6(?) : libjpeg 8c : libpng 1.4.3 : libtiff 3.9.4 : zlib 1.2.8

The image is a black & white TIFF file, compressed with CCITT Group 4 fax
encoding. The resolution is 300 dpi, the size is 2461 x 3478 pixels and the
file size is 112K. Unfortunately I cannot attach the file.

The image is a form that includes some handwritten areas. If I redact the
image with black boxes on top of every handwritten area, then tesseract is
able to process the file without crashing. So my first thought was that the
problem is in the recognition of handwriting.

However, I also tried resizing the original (unredacted) image to 1920 x
2714 and that worked too. So it seems the handwriting is no longer a
problem when the image is somewhat smaller (about 78% of the original size).
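
For an automated pipeline, the resize workaround could be wrapped in a retry: try the original image first and fall back to a downscaled copy only if tesseract fails. The following is a rough sketch of that idea, not a tested fix for the assertion itself. It assumes the tesseract command-line tool is on the PATH and that the assertion failure shows up as a non-zero exit code. The function names and the `resize` callback are hypothetical; the callback would be supplied by whatever imaging library is in use, and 0.78 is roughly 1920 / 2461.

```python
import subprocess
from typing import Callable, Iterable, Optional, Tuple


def scaled_size(width: int, height: int, scale: float) -> Tuple[int, int]:
    """Return the pixel dimensions after uniform scaling, never below 1x1."""
    return max(1, round(width * scale)), max(1, round(height * scale))


def tesseract_ok(image_path: str, output_base: str) -> bool:
    """Run the tesseract CLI on one image.

    Assumption: a crash such as the assertion failure is reported
    to the caller as a non-zero exit code.
    """
    result = subprocess.run(["tesseract", image_path, output_base])
    return result.returncode == 0


def ocr_with_fallback(
    image_path: str,
    output_base: str,
    resize: Callable[[str, float], str],
    run_ocr: Callable[[str, str], bool] = tesseract_ok,
    scales: Iterable[float] = (1.0, 0.78),
) -> Optional[float]:
    """Try each scale in order and return the first one that succeeds.

    `resize` is a hypothetical helper that writes a scaled copy of the
    image and returns the path of that copy. Returns None if every
    attempt fails.
    """
    for scale in scales:
        # Scale 1.0 means the untouched original, so no copy is needed.
        candidate = image_path if scale == 1.0 else resize(image_path, scale)
        if run_ocr(candidate, output_base):
            return scale
    return None
```

Passing `run_ocr` and `resize` as parameters keeps the retry logic independent of any particular imaging library and makes it possible to exercise it without tesseract installed.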

I am trying to use tesseract in an automated system that processes scanned
images. Any ideas on how to resolve this?

Thank you very much

George

