Hello! I have spent a couple of days getting familiar with Tesseract, and the more I learn, the more I realize how much there is to it. I am posting here in the hope that more experienced users/devs can point me in the right direction, so that I don't spend days barking up the wrong tree.

What I want to do:
* OCR exactly one word, with no spaces, from a screenshot.

The good:
* I know exactly where this word will appear, so I can feed Tesseract a bitmap with the word centered and as much free space around it as desired. There is no skew or slope, just plain, straight text.
* The word is not a real word but 1-5 random capital letters used to ID goods containers, so the format is known.
* I can easily create a dictionary of all acceptable "words" in this "language" (I have an Excel file of all container IDs).
* The font is simple: even strokes, with no serifs or fancy stuff. However, it is not monospaced.

The bad:
* The font is small, only about 8 px in height.

What I have done so far:
* Using the Leptonica utility provided with Capture2Text <http://capture2text.sourceforge.net/>, I have pre-processed the BMP with pretty much the default values that Capture2Text uses: scaling it 3.5x, inverting the colors, and converting it to black and white (the original is white on dark grey).
* I have created a config file containing only the English capital letters, to use as a whitelist.
* I pass the resulting TIF to Tesseract with the whitelist and -psm 8 (single word).
* I have NOT yet applied a dictionary, since I want to optimize the other parameters first and add the dictionary last.
* I don't specify a language, so I think Tesseract is using the default (English).

With these steps I get a pretty good but not perfect result. What is especially interesting to me is that Leptonica seems to handle the preprocessing differently depending on whether the original BMP is shifted a pixel left or right, even though there is plenty of space around the word. That seems strange to me, and it makes the results inconsistent.

So, could anyone with experience of, or thoughts on, OCR of screen captures comment on this and send me off in the right direction to optimize this further?
* Training Tesseract, somehow?
* Should I use ImageMagick instead of Leptonica for more consistent results? Or use different parameters/functions with Leptonica? Recommendations?
* Other things to consider?
* I have found that most information applies primarily to people scanning tons of documents. Since this application is somewhat different, with different problems, I would like to know whether anyone here has done something similar and how they got it to work.

FYI: The input comes from another program running on a computer. I have no way of accessing the text programmatically, copying it to the clipboard, or similar, so let's focus on a solution with OCR. I can't control how it is rendered on screen either.

Thank you for your time!
/Kris
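
To make the steps above a bit more concrete, here is a minimal Python sketch of the Tesseract call I described, plus the kind of dictionary check I plan to add last. It is only an illustration: the function names and the `known_ids` list are hypothetical, the preprocessing is not shown, and the fuzzy match uses Python's `difflib` as a stand-in rather than Tesseract's own dictionary support. The `-psm` and `-c tessedit_char_whitelist=...` options are written the way I understand the 3.x command line; newer releases may spell them differently.

```python
import difflib
import re

# Format stated above: 1-5 capital letters and nothing else.
ID_PATTERN = re.compile(r"[A-Z]{1,5}")
WHITELIST = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def tesseract_command(image_path):
    """Build (but do not run) a single-word Tesseract command line."""
    return [
        "tesseract", image_path, "stdout",
        "-psm", "8",  # page segmentation mode 8: treat the image as one word
        "-c", "tessedit_char_whitelist=" + WHITELIST,
    ]


def clean_ocr_output(raw):
    """Strip whitespace; return None unless the result is 1-5 capitals."""
    word = raw.strip()
    return word if ID_PATTERN.fullmatch(word) else None


def snap_to_known_id(word, known_ids, cutoff=0.6):
    """Return word if it is a known ID, else the closest known ID, else None.

    Assumption: difflib similarity stands in for a real dictionary step.
    """
    if word in known_ids:
        return word
    matches = difflib.get_close_matches(word, known_ids, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

The command list could be handed to `subprocess.run` once the preprocessed TIF exists, and the raw output passed through `clean_ocr_output` and then `snap_to_known_id` with the IDs exported from the Excel file. The `cutoff` value is a guess and would need tuning against real misreads.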

