I am trying to use Tesseract 3.03 to OCR scanned pages. In many cases one scan job contains several jobs, which are separated by feeding a special separator page between them. This page contains only 12 "T" characters at the top left of the page (and a second line, upside down, at the bottom right).
I have tried a lot, but it seems that Tesseract completely ignores this text, even though the scan looks great. The output for that page is completely empty! The rest of the OCRed text looks good. The idea is not mine, but I have to use this kind of separation. Is there something I can do to improve recognition of this special text?

Nicolas Nickisch

