Just to let you know, I can't see the images yet...
On Thu, Apr 7, 2011 at 8:17 AM, Amrit <[email protected]> wrote:
> Hi Dmitri/Partik,
> Thanks for your reply.I am sending along the pre processed test image which
> I had mentioned in my response.
> tesseract output - SOUTHBURY~ CT DLUBB
>
> Regards,
> Amrit.
>
> On Wed, Apr 6, 2011 at 12:05 AM, Dmitri Silaev <[email protected]>
> wrote:
>>
>> Agree not to use dictionary at all. IMO the best you can do is:
>> - use appropriate whitelists for each character position
>> - obtain a set of char choices for every char position
>> - restrict choice sets by using other semantic information you may have
>>
>> Warm regards,
>> Dmitri Silaev
>>
>> On Wed, Apr 6, 2011 at 6:00 AM, Amrit <[email protected]> wrote:
>> > Hi All,
>> > I am trying to evaluate tesseract to decode US postal address
>> > from a set of images(english text with varying font).I want to extract
>> > the city,state zipcode combination from the image.In doing so, out of
>> > the box tesseract 3.01 performance is average and I would like to
>> > increase the accuracy of the system by providing a custom grammar/
>> > wordlist (language model).
>> > Any idea as to how to accomplish this?(My custom grammar/
>> > language model will only contain City,State and ZipCode numbers).
>> >
>> > I have tried to create custom dawg by following on the lines of
>> > 'training tesseract 3' wiki page, but this doesn't seem to work at
>> > all.Is there any way I can do this without training a subset of my
>> > test images?
>> >
>> > Regards,
>> > Amrit.
>> >
>> > --
>> > You received this message because you are subscribed to the Google
>> > Groups "tesseract-ocr" group.
>> > To post to this group, send email to [email protected].
>> > To unsubscribe from this group, send email to
>> > [email protected].
>> > For more options, visit this group at
>> > http://groups.google.com/group/tesseract-ocr?hl=en.
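
For what it's worth, the three steps Dmitri lists in the quoted message (per-position whitelists, per-position char choices, semantic restriction) could be sketched roughly as below. This is only an illustration in plain Python: the function names, the candidate lists and their scores are made up, the state set is a tiny sample rather than the full USPS list, and how you get the per-character choices out of tesseract is not shown.

```python
import string

# Assumption: a tiny sample of state codes; a real system would load the full list.
STATE_CODES = {"CT", "NY", "NJ", "MA"}
DIGITS = set(string.digits)


def pick_chars(choices, whitelists):
    """Pick the best-scoring *allowed* character at every position.

    choices:    one list per character position of (char, score) candidates
    whitelists: one set of allowed characters per position
    Returns the resulting string, or None if a position has no allowed candidate.
    """
    result = []
    for candidates, allowed in zip(choices, whitelists):
        legal = [(ch, score) for ch, score in candidates if ch in allowed]
        if not legal:
            return None
        result.append(max(legal, key=lambda pair: pair[1])[0])
    return "".join(result)


def pick_state(choices):
    """Semantic restriction: keep only combinations that are known state codes."""
    best_code, best_score = None, -1.0
    for first, score1 in choices[0]:
        for second, score2 in choices[1]:
            code = first + second
            combined = score1 * score2
            if code in STATE_CODES and combined > best_score:
                best_code, best_score = code, combined
    return best_code


# Hypothetical candidate lists (not real tesseract output).
zip_choices = [
    [("I", 0.5), ("1", 0.4)],
    [("Z", 0.6), ("2", 0.3)],
    [("3", 0.9)],
    [("A", 0.5), ("4", 0.45)],
    [("S", 0.7), ("5", 0.2)],
]
state_choices = [
    [("G", 0.5), ("C", 0.45)],
    [("T", 0.9), ("I", 0.1)],
]

zip_code = pick_chars(zip_choices, [DIGITS] * 5)  # a zip code position accepts digits only
state = pick_state(state_choices)
print(state, zip_code)
```

The point is that the letter candidates may score higher on their own, but they are thrown away because a zip code position only accepts digits and the state field only accepts known codes.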

