lol thanks Albert, now I know :)

Thanks dythmall, I'd thought that might be the case. I did some tests
and found that by selecting a specific area that I know will contain a
certain number of characters, I can apply my own adaptive threshold
based on the density of black pixels I'd expect. So far it's increased
the accuracy quite a bit! Next I'm planning on training Tesseract
based on the black and white images my threshold creates rather than
the actual font being used. Hopefully if I train it on more realistic
data it will be even more accurate.
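
For anyone wanting to try the density-based threshold described above, here
is a minimal sketch, not the exact code from those tests. It assumes the
region is a greyscale grid (a list of rows of 0-255 values) and that the text
is darker than the background. The names threshold_by_density and
black_fraction are made up for illustration.

```python
def threshold_by_density(gray, black_fraction):
    """Binarise a greyscale region so that roughly `black_fraction`
    of its pixels end up black (0); the rest become white (255).

    gray: list of rows, each a list of ints in 0..255 (assumed layout).
    black_fraction: expected share of ink pixels, between 0.0 and 1.0.
    """
    pixels = sorted(p for row in gray for p in row)
    if not pixels:
        return []
    # Number of pixels expected to be ink in this region.
    n_black = int(round(black_fraction * len(pixels)))
    if n_black <= 0:
        return [[255] * len(row) for row in gray]
    # The grey level of the n_black-th darkest pixel becomes the cut-off.
    cutoff = pixels[min(n_black, len(pixels)) - 1]
    # Ties at the cut-off level all turn black, so the final share of
    # black pixels can come out slightly above black_fraction.
    return [[0 if p <= cutoff else 255 for p in row] for row in gray]


region = [
    [10, 200, 220],
    [30, 210, 40],
    [190, 20, 230],
]
binary = threshold_by_density(region, 0.44)
```

The expected fraction would come from knowing how many characters sit in the
selected area and roughly how much ink each one contributes, so it has to be
measured per font and size.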

I've been trying to think of ways to remove the background, but it
needs to be automated. If I had a copy of the background image without
the text on it, I could combine the two using a difference filter and,
hey presto, the text would pop out on its own. Thanks again for the reply!
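
The difference-filter idea could look something like the sketch below. It
assumes both images are greyscale grids of the same size (lists of rows of
0-255 values) and are already pixel-aligned with the same lighting, which is
a big assumption for real scans. difference_mask and min_diff are
hypothetical names, and the default of 40 is an arbitrary starting point
rather than a tested value.

```python
def difference_mask(image, background, min_diff=40):
    """Mark pixels that differ from the clean background as text.

    image, background: lists of rows of ints in 0..255, same dimensions.
    min_diff: smallest absolute grey-level difference counted as text
              (hypothetical default, needs tuning per source).
    Returns a grid with 0 (black) for text and 255 (white) elsewhere.
    """
    if len(image) != len(background) or any(
        len(irow) != len(brow) for irow, brow in zip(image, background)
    ):
        raise ValueError("image and background must have the same dimensions")
    return [
        # Absolute difference, then a fixed threshold on that difference.
        [0 if abs(p - b) >= min_diff else 255 for p, b in zip(irow, brow)]
        for irow, brow in zip(image, background)
    ]


background = [[100, 150], [200, 50]]
with_text = [[100, 20], [205, 50]]
mask = difference_mask(with_text, background)
```

Small differences (noise, compression artefacts) fall below min_diff and stay
white, so only pixels that changed substantially are kept as text.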
