Hello,

I ran into an assertion failure when running tesseract on a scanned image. The output that I get is:

    Page 1
    Detected 146 diacritics
    split_pt >0 && split_pt < word->chopped_word->NumBlobs():Error:Assert failed:in file ..\..\ccmain\tfacepp.cpp, line 186

I am testing on Windows and the tesseract version is:

    tesseract 3.04.02dev
     leptonica-1.71 (Oct 21 2016, 18:04:17) [MSC v.1800 DLL Release x86]
      libgif 4.1.6(?) : libjpeg 8c : libpng 1.4.3 : libtiff 3.9.4 : zlib 1.2.8

The image is a TIFF file, black and white, compressed with CCITT Group 4 fax encoding. The resolution is 300 dpi, the size is 2461 x 3478 pixels, and the file size is 112K. Unfortunately I cannot attach the file.

The image is a form that includes some handwritten areas. If I redact the image with black boxes on top of every handwritten area, tesseract processes the file without crashing. So my first thought was that the problem is in the recognition of handwriting. However, I also tried resizing the original (unredacted) image to 1920 x 2714, and that worked too. So the handwriting no longer seems to be a problem when the image is slightly smaller.

I am trying to use tesseract in an automated system that processes scanned images. Any ideas on how to resolve this?

Thank you very much,
George
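A possible stopgap for the automated system, based on the resize observation above, is to retry on a downscaled copy whenever tesseract fails at full size. The following is only a minimal sketch, not a fix for the assertion itself. It assumes that the assertion failure surfaces as a non-zero exit code from the tesseract command-line tool, that Pillow is installed for the resizing step, and that a factor of about 0.78 (2461 x 3478 down to roughly 1920 x 2714) is a reasonable first retry. The function names and the output file naming are hypothetical.

```python
import subprocess
from pathlib import Path


def run_tesseract(image_path, output_base):
    """Run the tesseract CLI; return True if it exited with status 0.

    Assumption: a failed assertion ends the process with a non-zero status.
    """
    result = subprocess.run(
        ["tesseract", str(image_path), str(output_base)],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0


def downscale_image(image_path, factor):
    """Write a downscaled copy next to the original and return its path."""
    from PIL import Image  # Pillow, imported lazily; assumed to be installed

    src = Path(image_path)
    # Hypothetical naming scheme, e.g. scan.tif -> scan_x78.tif
    dst = src.with_name(f"{src.stem}_x{int(factor * 100)}{src.suffix}")
    with Image.open(src) as img:
        new_size = (round(img.width * factor), round(img.height * factor))
        img.resize(new_size).save(dst)
    return dst


def ocr_with_fallback(image_path, output_base, factors=(0.78,),
                      run_ocr=run_tesseract, downscale=downscale_image):
    """Try OCR at full size, then on each downscaled copy until one succeeds.

    Returns the path of the image that was processed successfully, or None
    if every attempt failed.
    """
    if run_ocr(image_path, output_base):
        return image_path
    for factor in factors:
        smaller = downscale(image_path, factor)
        if run_ocr(smaller, output_base):
            return smaller
    return None
```

Calling ocr_with_fallback("scan.tif", "scan_out") would first try the original file and only create a smaller copy if that attempt fails. Downscaling lowers the effective resolution below 300 dpi, so recognition quality on the retried pages may suffer and should be checked separately.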

