Hi Rob,

> As I only have to recognise 2,3,...,9,10,J,Q,K,A, a small subset of the English
> language file, would it be much more accurate (i.e. worthwhile) to retrain
> just on these characters, or to simply keep using the English language file?

Presuming the characters you want to recognise are not too far from
'normal' English characters, don't bother retraining.

The most important thing for you would be to whitelist only the
characters that can occur. Follow this guide:
http://code.google.com/p/tesseract-ocr/wiki/FAQ#How_do_I_recognize_only_digits?
If you're using Tesseract 3, copy the 'digits' config file
(tessdata/configs/digits) to a new file, e.g. tessdata/configs/cards,
add the extra characters (e.g. JQKA) to the whitelist in that copy,
and call tesseract something like this:
tesseract imagename outputbase cards
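
If you'd rather script that step, here is a rough Python sketch. It
assumes the config file holds a single tessedit_char_whitelist line
(which is what I'd expect the stock 'digits' file to boil down to, so
check yours), and the helper names write_cards_config and build_command
are made up for illustration. Note that "10" means you need 0 and 1 as
well as 2-9.

```python
from pathlib import Path

# Characters that can appear on a card rank: the digits 2-9,
# '1' and '0' for "10", plus J, Q, K and A.
CARD_CHARS = "0123456789JQKA"


def write_cards_config(configs_dir):
    """Write a 'cards' config file into configs_dir (hypothetical helper).

    Assumption: a single tessedit_char_whitelist line is enough;
    compare against your own tessdata/configs/digits before relying on it.
    """
    path = Path(configs_dir) / "cards"
    path.write_text(f"tessedit_char_whitelist {CARD_CHARS}\n")
    return path


def build_command(image, outputbase, config="cards"):
    """Return the argument list for: tesseract imagename outputbase cards."""
    return ["tesseract", image, outputbase, config]
```

You would point write_cards_config at your tessdata/configs directory
and hand the list from build_command to subprocess.run.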

Fun-sounding project. Let us know how it goes!

Nick
