TesseractExtractResult() returns a confidence number for every
character it recognizes. A high number means low confidence. Caveats:
1. The confidence numbers are the same for all letters in a word
(Tesseract does compute a confidence number for each letter, but it
doesn't return them through the API).
2. From personal experience, these numbers are not very reliable, and
we decided not to use them. Feel free to test them yourself, though;
we gave up fairly quickly.
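
If you do want to test it, the selection step could look roughly like
the sketch below. This is plain Python, not Tesseract's API: it assumes
you have already run the OCR once per threshold and collected the
per-character numbers yourself. The function names and the sample
numbers are made up for illustration.

```python
from statistics import mean

def mean_cost(char_costs):
    """Average per-character number for one OCR run (lower = more confident).

    An empty run (nothing recognized) is treated as the worst possible result.
    """
    return mean(char_costs) if char_costs else float("inf")

def pick_best_threshold(costs_by_threshold):
    """Return the threshold whose OCR run has the lowest average number."""
    return min(costs_by_threshold,
               key=lambda t: mean_cost(costs_by_threshold[t]))

# Hypothetical data: threshold -> per-character numbers from one OCR run.
# Values repeat within a word because every letter of a word gets the
# same number (caveat 1).
runs = {
    100: [2.0, 2.0, 2.0, 2.0, 9.5, 9.5, 9.5],
    128: [1.5, 1.5, 1.5, 1.5, 3.0, 3.0, 3.0],
    160: [],  # nothing recognized at this threshold
}

best = pick_best_threshold(runs)
print(best)
```

Because the numbers are identical within a word, a per-character
average effectively weights each word by its length. Given caveat 2,
treat this as an experiment rather than something to rely on.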

Patrick

On Jul 9, 5:01 am, caro <[email protected]> wrote:
> I am working with tesseract OCR and I would like to get at the end of
> the algorithm a confidence value which may express if the recognition
> seems OK or not really.
>
> For example, I have an image with the text: TEST RESULTS ARE OK.
> Depending on a threshold value, I can get different output of the OCR:
>  - TEST RESSUTTS AKE OC
>  - TEST TELLUTTS ARE OB
> ....
> The best threshold can be different for different images.
> So if I can get this confidence value, maybe it can give me the best
> threshold to choose for the OCR?
>
> Thank you for your help,
> Caroline
