Hi Roger,

Tesseract has loads of options, but they're mostly specified as arguments after the -c flag. A lot of the options you can set that way are really meant for debugging or development, though, so it's best to check first whether what you want to do is mentioned in the wiki (in this case it is, at [0] and [1]).
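
As a rough sketch of what the -c form looks like in practice: each parameter is passed as name=value after -c. The file names image.png and output are placeholders I made up, the digits-only value is just an illustration of the FAQ topic in [1], and whether --print-parameters is available depends on the tesseract version you have installed, so treat this as an assumption-laden example rather than a recipe.

```shell
#!/bin/sh
# Placeholder names: image.png and output are assumptions for illustration.
image='image.png'
output='output'            # tesseract appends .txt to this base name
param='tessedit_char_whitelist'
value='0123456789'         # digits only, as in the FAQ entry [1]

# Parameters passed with -c use the form name=value.
cmd="tesseract $image $output -c $param=$value"
echo "$cmd"

# Only call tesseract if it is actually installed.
if command -v tesseract >/dev/null 2>&1; then
    # List the parameters this build knows about and filter for ours.
    tesseract --print-parameters 2>/dev/null | grep "$param" || true
    # Run the OCR only when the placeholder image really exists.
    if [ -f "$image" ]; then
        $cmd
    fi
fi
```

The script only prints the command unless tesseract and the image are both present, so it is safe to run as-is and adapt from there.
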
> "-C string
> only recognise characters from string, this is a filter function in cases
> where the interest is only to a part of the character alphabet, you can use
> 0-9 or a-z to specify ranges, use -- to detect the minus sign"

You can do this with tesseract with a command like this:

tesseract image output -c tessedit_char_whitelist=string

> "-l level
> set grey level to level (0<160<=255, default: 0 for autodetect), darker pixels
> belong to characters, brighter pixels are interpreted as background of the
> input image"

Tesseract essentially always does "autodetect" here. If it gets the binarisation (converting from grey to black & white) wrong, you need to preprocess the image until it works ;)

Hope that helps,

Nick

0. https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality#Dictionaries,_word_lists,_and_patterns
1. https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_do_I_recognize_only_digits?

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

