I have an image containing English words with their phonetic
equivalents printed alongside (a pronunciation guide). It's actually
for a spelling bee. The organizers of the bee would very much like to
get this image as a textual list (so they can manipulate it, pull
words from it randomly, and so on) and they came to me for help. I've
used Tesseract before, but only to recognize English and French text.

The pronunciation guide contains text with lots of umlauts,
upside-down letter e's, etc.: more specialized characters than French
uses (French is one thing I tried, in an effort to improve the
recognition). Does anyone have any advice for recognizing characters
like this? Should I start training Tesseract to recognize them? I've
never done any training before, which is why I'm a bit reluctant.

Thanks in advance!

Terrence

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en