Using Grammar to Improve Image Decoding Accuracy

Amrit Tue, 05 Apr 2011 19:49:48 -0700

Hi All,
        I am trying to evaluate tesseract for decoding US postal addresses
from a set of images (English text in varying fonts). I want to extract
the city, state, and ZIP code combination from each image. Out of
the box, tesseract 3.01's accuracy is only average, and I would like to
improve it by providing a custom grammar/wordlist (language model).
       Any idea how to accomplish this? (My custom grammar/language
model will contain only city, state, and ZIP code values.)
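
To make the target format concrete, here is a minimal Python sketch of the
"City, State ZIP" pattern such a grammar would accept. The regex, the
parse_address_line name, and the short state list are illustrative
assumptions, not part of tesseract. It only checks OCR output after the
fact and does not change how tesseract decodes the image.

```python
import re

# Illustrative subset of two-letter state codes (assumption: a real list
# would cover every state and territory).
STATES = {"CA", "NY", "TX", "WA", "IL"}

# City (letters, spaces, periods, apostrophes, hyphens), an optional comma,
# a two-letter state code, then a 5-digit ZIP with an optional +4 suffix.
ADDRESS_RE = re.compile(
    r"^\s*(?P<city>[A-Za-z][A-Za-z .'-]*?)\s*,?\s+"
    r"(?P<state>[A-Za-z]{2})\s+"
    r"(?P<zip>\d{5}(?:-\d{4})?)\s*$"
)


def parse_address_line(line):
    """Return (city, state, zip) if the line fits the pattern, else None."""
    match = ADDRESS_RE.match(line)
    if match is None:
        return None
    state = match.group("state").upper()
    if state not in STATES:
        # Reject state codes that are not in the known list.
        return None
    return match.group("city").strip(), state, match.group("zip")
```

A line that returns None from this check could be flagged for a second
OCR pass or for manual review.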


I have tried to create a custom DAWG by following the
'training tesseract 3' wiki page, but this doesn't seem to work at
all. Is there any way I can do this without training on a subset of my
test images?

Regards,
Amrit.

