Hi, thanks for the awesome open-source OCR application.

I can generate HTML and box files using a config file like this:

    tessedit_char_whitelist abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
    tessedit_create_boxfile 1
    tessedit_create_hocr 1

This does not seem to produce confidence values, either per word or per letter. The box file looks like this:

    a 1883 3619 1940 3684 0
    d 1946 3617 2007 3704 0
    e 2014 3618 2069 3684 0

And the <body> of the HTML hOCR file looks identical:

    a 1883 3619 1940 3684 0
    d 1946 3617 2007 3704 0
    e 2014 3618 2069 3684 0

Is there a variable I can set in the config file to produce confidence values for words or letters?

I am using:

    tesseract 3.02.02
     leptonica-1.69
      libjpeg 8d : libpng 1.5.14 : libtiff 4.0.3 : zlib 1.2.5

... compiled on a Mac, OS X 10.8.3. Works great.

Many thanks,
Perry

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
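
For anyone reading the box lines quoted above: as far as I understand the Tesseract box file format, each line is `<symbol> <left> <bottom> <right> <top> <page>`, with coordinates measured from the bottom-left corner of the image. Under that reading, the trailing `0` is the page index and not a confidence score, so the box file would have no confidence column to begin with. The sketch below is only an illustration of that layout; `BoxEntry` and `parse_box_line` are hypothetical names I made up, not part of Tesseract.

```python
from typing import NamedTuple


class BoxEntry(NamedTuple):
    """One line of a box file (hypothetical helper type, not a Tesseract API)."""
    glyph: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> BoxEntry:
    """Parse '<symbol> <left> <bottom> <right> <top> <page>' into a BoxEntry.

    Assumes the six-field layout described above; raises ValueError otherwise.
    """
    parts = line.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")
    left, bottom, right, top, page = (int(p) for p in parts[1:])
    return BoxEntry(parts[0], left, bottom, right, top, page)


# The three lines quoted in the post.
sample = """a 1883 3619 1940 3684 0
d 1946 3617 2007 3704 0
e 2014 3618 2069 3684 0"""

entries = [parse_box_line(line) for line in sample.splitlines()]

for entry in entries:
    width = entry.right - entry.left
    height = entry.top - entry.bottom
    print(f"{entry.glyph}: page {entry.page}, {width}x{height} px")
```

If that interpretation is right, confidence values would have to come from a different output than the box file.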

