Hi,

Thanks for the awesome open-source OCR application.

I can generate hOCR (HTML) and box files using a config file like this:

tessedit_char_whitelist abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
tessedit_create_boxfile 1
tessedit_create_hocr 1

This does not seem to produce confidence values, either per word or per
letter.

The box file looks like this:

a 1883 3619 1940 3684 0
d 1946 3617 2007 3704 0
e 2014 3618 2069 3684 0
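
To make the fields explicit, here is a minimal Python sketch that reads lines like the ones above. It assumes the usual Tesseract box layout (glyph, left, bottom, right, top, page); the names Box and parse_box_line are hypothetical helpers for illustration, not part of Tesseract. Under that assumption the trailing 0 is a page number, so none of the six fields carries a confidence value.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """One box-file line (assumed layout: glyph left bottom right top page)."""
    glyph: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    # Split from the right so the glyph is kept whole even if it
    # is more than one character long.
    glyph, left, bottom, right, top, page = line.rsplit(" ", 5)
    return Box(glyph, int(left), int(bottom), int(right), int(top), int(page))


# Sample lines copied from the box file shown above.
sample = """a 1883 3619 1940 3684 0
d 1946 3617 2007 3704 0
e 2014 3618 2069 3684 0"""

boxes = [parse_box_line(l) for l in sample.splitlines() if l.strip()]

for b in boxes:
    # Print each glyph with its width and height in pixels.
    print(b.glyph, b.right - b.left, b.top - b.bottom)
```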

And the <body> of the hOCR HTML file looks identical:

a 1883 3619 1940 3684 0
d 1946 3617 2007 3704 0
e 2014 3618 2069 3684 0

Is there a variable I can set in the config file to produce confidence 
values for words or letters?

I am using:
tesseract 3.02.02
 leptonica-1.69
  libjpeg 8d : libpng 1.5.14 : libtiff 4.0.3 : zlib 1.2.5

... compiled on a Mac, OS X 10.8.3. Works great.

Many thanks -

Perry

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
