Hi, I am trying to find a way to define my own word list for Tesseract. I want it to consider only the words I define and pick the most likely one.
So I have a small image with a single word in it. I process it with this command to get a pure black-on-white image:

$ convert -colorspace gray -auto-level -threshold 60% -type bilevel -depth 8 image.png newimage.png

Then I try to extract a single word from that image:

$ tesseract newimage.png -psm 8 stdout

It returns a single word (which is great), but it is slightly incorrect:

Expectation: nieko
Result: flieko

I've just spent 5+ hours trying to find any documentation or tutorial on how to set a whitelist dictionary for word recognition. Any tips on that?
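To make the goal concrete, here is a minimal sketch of the behaviour described above, done as a post-processing step outside Tesseract rather than through any Tesseract dictionary setting. It snaps the OCR output to the closest entry in a fixed word list using Python's standard difflib module. The word list, the function name closest_word, and the 0.6 similarity cutoff are all assumptions for illustration; only "nieko" and "flieko" come from the example above.

```python
import difflib

# Hypothetical whitelist; only "nieko" is taken from the post.
ALLOWED_WORDS = ["nieko", "viskas", "niekada"]


def closest_word(ocr_text, allowed=ALLOWED_WORDS, cutoff=0.6):
    """Return the allowed word most similar to the OCR output, or None.

    cutoff is an assumed similarity threshold in [0, 1]; outputs that
    resemble no allowed word closely enough are rejected.
    """
    candidate = ocr_text.strip().lower()
    matches = difflib.get_close_matches(candidate, allowed, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # "flieko" is the misread result reported above.
    print(closest_word("flieko"))  # prints: nieko
```

This only corrects the text after recognition; it does not change what Tesseract itself considers while reading the image, so it may not help when the raw output is far from every allowed word.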

