[tesseract-ocr] Tesseract missing quite obvious word

Mark Zealey Thu, 09 Oct 2014 12:05:11 -0700

I've got a weird issue where tesseract recognises 99% of my files fine, 
but in this case it drops something that is obviously a word: 'RAGMEN', 
in the first line of the image. It looks to me like the segmentation is 
failing: running with scrollview shows the word in grey, and the words 
around it are also usually shown with joined blobs. Can anyone recommend 
a config option to fix this? I've tried all manner of different config 
options but can't seem to make any difference. The file is here:


http://mark.zealey.org/tessmissingword.jpg

Thanks!

Mark
