On Tuesday, December 22, 2015 at 2:04:26 AM UTC-5, Utkarsh Sinha wrote:
>
> I'm trying to find out why Tesseract is rejecting certain blobs from the 
> image here. The text "nestle" and "nesquik" have overlapping baselines. I 
> suspect the overlap might be causing it to stop recognizing anything at all.
>

They're not only overlapping, they are also at something like a 30-degree 
angle to each other.  It doesn't surprise me that Tesseract considers that an 
unreasonable amount of interline skew.  Where would one see that in a 
normal text layout? Additionally, the "Nesquik" isn't really text, but a 
stylized logotype.
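
If you want to put a number on the skew between the two lines, here is a
rough sketch in plain Python. The endpoint coordinates and the 5-degree
threshold are made up for illustration; they are not measured from your
image and they are not Tesseract's actual internal limit. The function
names are hypothetical too.

```python
import math


def baseline_angle(p0, p1):
    """Angle in degrees of the baseline running from p0 to p1, measured from the x-axis."""
    return math.degrees(math.atan2(p1[1] - p0[1], p1[0] - p0[0]))


def interline_skew(line_a, line_b):
    """Smallest absolute angle in degrees (0..90) between two baselines.

    Each line is a pair of (x, y) endpoints. Direction is ignored, so a
    baseline and its reverse count as parallel.
    """
    diff = abs(baseline_angle(*line_a) - baseline_angle(*line_b)) % 180.0
    return min(diff, 180.0 - diff)


# Hypothetical baselines: one horizontal, one tilted by 30 degrees.
nestle = ((0.0, 0.0), (1.0, 0.0))
nesquik = ((0.0, 0.0), (math.cos(math.radians(30)), math.sin(math.radians(30))))

# Assumed tolerance for "normal" text layout, chosen only for this example.
MAX_SKEW_DEG = 5.0

skew = interline_skew(nestle, nesquik)
print(f"interline skew: {skew:.1f} degrees, plausible text layout: {skew <= MAX_SKEW_DEG}")
```

You would get the endpoints from whatever line or blob fitting you already
do on the image; the point is only that two lines of ordinary text rarely
differ by more than a few degrees.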

Perhaps consider matching the logo with feature detectors from OpenCV 
(SIFT, SURF, etc.) instead?

Tom
