Hello! I'm trying to find out why Tesseract is rejecting certain blobs from the image here. The words "nestle" and "nesquik" have overlapping baselines, and I suspect the overlap is causing Tesseract to stop recognizing anything at all.
<https://lh3.googleusercontent.com/-UPrCftV13b8/VnhbBdgahkI/AAAAAAAAQCY/QGkIm0Nn3ew/s1600/nesquik-small.png>

I've also included some debug images I captured from Tesseract. From these images, it seems that Tesseract correctly identifies the blobs for each character and finds the correct baselines. However, it chops "nesquik" out of the image based on "nestle".

<https://lh3.googleusercontent.com/-8Kb86BrTxWo/VnhbVsWTCzI/AAAAAAAAQCg/y-YS4sLeOco/s1600/nesquik1.png>
<https://lh3.googleusercontent.com/-8BIjiSuo5zg/VnhbXhA9uZI/AAAAAAAAQCo/6Gqa2qkKrRc/s1600/nesquik2.png>

Are there any Tesseract config options I can use to get this working?

Thanks!

