I've got a weird issue: tesseract recognises 99% of my files fine, but in this case it drops something that is obviously a word, 'RAGMEN' in the first line. It looks to me like the segmentation is failing: running with scrollview shows the word in grey, and the words around it are usually shown with joined blobs. Can anyone recommend a config option to fix this? I've tried all manner of config options but none of them seem to make any difference. The file is here:
http://mark.zealey.org/tessmissingword.jpg

Thanks!
Mark
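
For anyone wanting to reproduce this, below is a rough sketch of how different page segmentation modes could be swept from a script to see which ones pick up the missing word. It is only an illustration, not what was actually run: it assumes the `tesseract` binary is on the PATH and that the build accepts `--psm` (older 3.x builds spell the flag `-psm`, hence the `psm_flag` parameter). The helper names `build_command` and `sweep` and the list of modes are made up for this example.

```python
import subprocess

# Assumption: modes worth comparing on a multi-line image.
# 3 = fully automatic (default), 4 = single column,
# 6 = single uniform block, 11 = sparse text.
CANDIDATE_PSMS = (3, 4, 6, 11)


def build_command(image_path, psm, psm_flag="--psm"):
    """Return the tesseract argv for one page segmentation mode.

    Using "stdout" as the output base makes tesseract print the
    recognised text instead of writing a .txt file.
    """
    return ["tesseract", image_path, "stdout", psm_flag, str(psm)]


def sweep(image_path, psms=CANDIDATE_PSMS, psm_flag="--psm"):
    """Run tesseract once per mode and return {psm: recognised text}."""
    results = {}
    for psm in psms:
        proc = subprocess.run(
            build_command(image_path, psm, psm_flag),
            capture_output=True,
            text=True,
            check=False,  # keep going even if one mode fails
        )
        results[psm] = proc.stdout.strip()
    return results
```

Calling `sweep("tessmissingword.jpg")` on a local copy of the image and checking which of the returned strings contain "RAGMEN" would show whether any of these modes recovers the word.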

