Thank you

On Wed, Oct 6, 2010 at 3:26 AM, Jimmy O'Regan <[email protected]> wrote:
> On 5 October 2010 23:43, haratron <[email protected]> wrote:
>> I'm using tesseract 3.00 with hOCR output and I get the xocr_word
>> among other things.
>> Example:
>> <span class='xocr_word' id='xword_1_5' title="x_wconf -4">testing</span>
>>
>> The x_wconf attribute is for certainty of the result. Which is
>> calculated through a certainty() function, from what I saw in
>> tesseract's source.
>> The problem is that I can't find the function's definition anywhere.
>> How does it work? What are the boundaries (lower and upper limit) of
>> the certainty() return value?
>
> There is no single 'certainty' function that calculates certainty. The
> certainty() member of the WERD_CHOICE class is an accessor method;
> multiple functions may be involved in calculating an overall certainty
> for a particular word: TessBaseAPI::AllWordConfidences() will give you
> an array of candidates, but through the hOCR output, you'll only get
> the one that was finally selected. The fragment that function uses to
> convert the value to one between 0 and 100 is:
>     int w_conf = static_cast<int>(100 + 5 * choice->certainty());
>     // This is the eq for converting Tesseract confidence to 1..100
>     if (w_conf < 0) w_conf = 0;
>     if (w_conf > 100) w_conf = 100;
>
> --
> <Leftmost> jimregan, that's because deep inside you, you are evil.
> <Leftmost> Also not-so-deep inside you.
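
For anyone finding this thread later, here is a minimal sketch of the conversion quoted above, wrapped in a standalone function so the bounds are easy to see. The name CertaintyToConfidence is hypothetical and not part of Tesseract; only the arithmetic and the clamping come from the quoted fragment. It also assumes the x_wconf value in the hOCR output (-4 in the example) is the raw certainty, which the thread does not confirm.

```cpp
#include <algorithm>

// Hypothetical helper, not a Tesseract API: applies the conversion quoted
// in the thread to a raw certainty value and clamps the result to 0..100.
int CertaintyToConfidence(float certainty) {
  // Same arithmetic as the quoted fragment: 100 + 5 * certainty, truncated.
  int w_conf = static_cast<int>(100 + 5 * certainty);
  // Equivalent to the two if-statements in the quoted fragment.
  return std::max(0, std::min(100, w_conf));
}
```

Under that assumption, a raw certainty of -4 would map to 100 + 5 * (-4) = 80. With this formula, any certainty of -20 or lower is clamped to 0, and any certainty of 0 or higher is clamped to 100.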
-- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.

