I'm working on a project where my source tiff image may have
background colours or images behind the text.

I've been able to train tesseract successfully with some other fonts,
which works very well, but the background does seem to confuse
tesseract a little.

My question is: does tesseract perform any image pre-processing of its
own? If not, is it worth applying a threshold or some other kind of
optimisation to the image before passing it in?
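For the thresholding step, something like this is what I had in mind (a
rough Pillow sketch; the fixed 128 cutoff and the file names are just
placeholders, and Otsu or adaptive thresholding might cope better with
busy backgrounds):

```python
from PIL import Image


def binarize(path, threshold=128):
    # Convert to greyscale, then map every pixel to pure black or white.
    # The threshold value is a guess; a busy background may need tuning
    # or a smarter method (e.g. Otsu).
    grey = Image.open(path).convert("L")
    return grey.point(lambda p: 255 if p > threshold else 0)


# Usage: binarize("page.tif").save("page-bw.tif")
```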

I've had a brief look through the source code, but I'm not really a
C++ developer, so it was a bit hard to follow. What I'm trying to
achieve is something like reading text from a magazine, where it's all
printed on top of a background image.

I'm trying to find out what sort of image tesseract is actually
working on internally, as perhaps I could then train it on a more
accurate representation of the text it needs to recognise.