On Mon, Apr 9, 2012 at 9:47 PM, David Eger wrote:
> The ability to use user patterns was added by Tesseract 3.01, and now has
> a little documentation. See the comment in dict/trie.h:
>
> http://code.google.com/p/tesseract-ocr/source/browse/tags/release-3.01/dict/trie.h
>
> And the newly updated man page here:
>
> http://code.google.com/p/tesseract-ocr/source/browse/trunk/doc/tesseract.1.asc
>
> I think the user pattern you want is:
>
> \A\A\d\d\d\A\A
>
> -David

David,

I just followed the instructions [1] and ran "tesseract eurotext.tif eurotext bazaar", with this result:

    Please provide at least 4 concrete characters at the beginning of the pattern
    Invalid user pattern 1-\d\d\d-GOOG-411
    Tesseract Open Source OCR Engine v3.02 with Leptonica
    Segmentation fault

It looks like the example in the man page is not correct, so I then used only its second line. With that change the error message is gone, but the segmentation fault is still there. Tesseract appears to crash at classify/normmatch.cpp:118 (Proto = (PROTOTYPE *) first_node (Protos); ).

It would be great to have an example in the man page that works with the example/testing images.

[1] http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html#_config_files_and_augmenting_with_user_data

--
Zdenko
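
A note on the first error: the message suggests that a user pattern is rejected when fewer than 4 literal ("concrete") characters come before its first backslash escape. The Python sketch below only approximates that rule so that patterns can be checked before they go into the user-patterns file. It is not Tesseract's own code: the threshold of 4 is taken from the error text above, the function names are hypothetical, and it counts Python characters rather than unicharset entries.

```python
# Rough pre-check for Tesseract user-pattern lines.
# Assumption: the threshold comes from the error text in this thread,
# not from reading the Tesseract source.
MIN_CONCRETE_CHARS = 4


def leading_concrete_chars(pattern: str) -> int:
    """Count the literal characters before the first backslash escape."""
    first_escape = pattern.find("\\")
    return len(pattern) if first_escape == -1 else first_escape


def meets_prefix_rule(pattern: str) -> bool:
    """True if the pattern starts with enough literal characters."""
    return leading_concrete_chars(pattern) >= MIN_CONCRETE_CHARS


if __name__ == "__main__":
    # The two man-page example lines and the pattern suggested in the quote.
    for line in (r"1-\d\d\d-GOOG-411", r"www.\n\\\*.com", r"\A\A\d\d\d\A\A"):
        status = "ok" if meets_prefix_rule(line) else "too few leading literals"
        print(f"{line}: {leading_concrete_chars(line)} leading literal(s), {status}")
```

Under this approximation, 1-\d\d\d-GOOG-411 has only two leading literal characters ("1-"), which matches the rejection reported above, while www.\n\\\*.com has four ("www.") and passes. The pattern \A\A\d\d\d\A\A has none, so it would presumably be rejected by the same check. None of this explains the segmentation fault.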

