Hello everyone. I am new to Tesseract and need your help detecting a defined set of words in a set of images. There is a known number of possibilities (son, daughter, head, wife, etc.), which makes the problem easier, but the variation in how the same word is written has made it difficult to detect the correct words.
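To illustrate why I think the closed word list helps: whatever the recognizer returns could be snapped to the nearest known word afterwards. Below is a minimal sketch of that idea using Python's standard `difflib`. It is only an assumption about how the post-processing might look and does not call Tesseract itself. The function name `snap_to_known` and the cutoff value are made up for illustration, and the word list contains only the examples above, not the full set.

```python
import difflib

# Only the example words from this post; the real list would be longer.
KNOWN_WORDS = ["son", "daughter", "head", "wife"]


def snap_to_known(raw_text, vocabulary=KNOWN_WORDS, cutoff=0.6):
    """Return the known word closest to a noisy OCR string, or None.

    `cutoff` is a similarity threshold between 0 and 1 (hypothetical
    value; it would need tuning on real samples).
    """
    cleaned = raw_text.strip().lower()
    matches = difflib.get_close_matches(cleaned, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Hypothetical noisy outputs, not results from my images.
    for noisy in ["daughtcr", " Wife ", "zzzz"]:
        print(repr(noisy), "->", snap_to_known(noisy))
```

This would not fix a recognizer that misreads most letters, but it means the recognition step only has to get close enough to tell the known words apart.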
Based on what I have read, there is one known approach and one experiment on which I would like your feedback.

1. Define a box file using individual letters. This is a known path that is reasonably well documented in the forum. I will probably have to create one font for each distinguishable writing style. Has anyone tried to detect a writing style similar to the attached sample? Is there anything I should be aware of?

2. Train a new language and treat each entire word, such as "son", as a single character. There would be only X characters, where X is the number of known possibilities I mentioned earlier. I am less optimistic about this one because of the variation in width and height even within the same writing style, but I am curious to hear from you.

I would appreciate any feedback.

Thanks,
Sol
<<attachment: sample.jpg>>

