Training by providing a text file accompanying an image?

Philipp Lenssen Thu, 20 Nov 2008 04:30:05 -0800

Hi! I read through the training tutorial
(http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract)
but wanted to see if there's an easier option than creating specific
bounding boxes for each letter (which, as I understand it, is what the
tutorial says one needs to do). Is there any option where one would
simply point to a TIF file and a TXT file containing the correct text,
and have Tesseract train on that pair?


For instance, I'm currently getting a result like this one on an
image:
------------
Aprll 15 1953
Foober
------------

So I would like to change the text to
------------
April 15 1953
Foobar
------------
... for training purposes (my guess is that Tesseract could try to
work out the bounding boxes itself, as it did for the first, incorrect
run).
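
To make the idea concrete, here is a minimal Python sketch of the
"keep the boxes, swap the text" step. It is only an illustration under
stated assumptions, not something Tesseract provides: it assumes the
boxes from the first run are available as lines of the form
"<char> <left> <bottom> <right> <top>", and that the first run found
exactly one box per character, so only the labels are wrong. The
function name retext_boxes and the coordinates in the sample are made
up for this example.

```python
def retext_boxes(box_lines, corrected_text):
    """Relabel box lines with the characters of the corrected text.

    Assumes one box per non-whitespace character, in reading order.
    Raises ValueError when the counts differ, because the boxes can
    then no longer be paired with characters by position alone.
    """
    # Whitespace has no box of its own, so drop it from the corrected text.
    glyphs = [ch for ch in corrected_text if not ch.isspace()]
    # Each box line is "<char> <left> <bottom> <right> <top>" (assumed format).
    boxes = [line.split() for line in box_lines if line.strip()]
    if len(glyphs) != len(boxes):
        raise ValueError(
            "%d characters but %d boxes" % (len(glyphs), len(boxes))
        )
    # Keep the coordinates, replace only the leading character.
    return [" ".join([ch] + fields[1:]) for ch, fields in zip(glyphs, boxes)]


if __name__ == "__main__":
    # Hypothetical boxes for the misread word "Foober".
    first_run = [
        "F 10 40 22 60",
        "o 24 40 34 54",
        "o 36 40 46 54",
        "b 48 40 58 60",
        "e 60 40 70 54",
        "r 72 40 80 54",
    ]
    for line in retext_boxes(first_run, "Foobar"):
        print(line)
```

This only covers the easy case. If the first run merged two letters
into one box or split one letter into two, the counts no longer match
and the boxes themselves would still need fixing by hand.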

Thanks!
