I have already looked at that code. It gives me the same result: every character in a word gets the same confidence value, even when each character is in a different font.
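To make the symptom concrete, here is a minimal sketch that groups consecutive symbols reporting exactly the same confidence. It does not call Tesseract at all; the names `SymbolConf`, `groupByConfidence` and `reportedOutput` are hypothetical helpers made up for this illustration, and the numbers are simply the ones from the output quoted below.

```cpp
#include <string>
#include <utility>
#include <vector>

// One recognised symbol and the confidence printed for it.
// (Hypothetical type for illustration, not part of the Tesseract API.)
struct SymbolConf {
    std::string symbol;
    float confidence;
};

// Merge consecutive symbols that report exactly the same confidence.
// Each returned pair holds the concatenated symbols and their shared value.
std::vector<std::pair<std::string, float>> groupByConfidence(
        const std::vector<SymbolConf>& symbols) {
    std::vector<std::pair<std::string, float>> runs;
    for (const SymbolConf& s : symbols) {
        if (!runs.empty() && runs.back().second == s.confidence) {
            runs.back().first += s.symbol;   // same value: extend the current run
        } else {
            runs.emplace_back(s.symbol, s.confidence);  // new value: start a run
        }
    }
    return runs;
}

// The symbol/confidence pairs copied from the output in the quoted message.
std::vector<SymbolConf> reportedOutput() {
    return {
        {"N", 72.3563f}, {"B", 72.3563f},
        {"E", 69.9937f}, {"T", 69.9937f}, {"R", 69.9937f}, {"A", 69.9937f},
        {"N", 69.9937f}, {"G", 69.9937f}, {"-", 69.9937f}, {"I", 69.9937f},
    };
}
```

Calling `groupByConfidence(reportedOutput())` collapses the ten symbols into just two runs, "NB" and "ETRANG-I", so the value changes only once across the whole output and never from character to character within a run.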
On Tue, Jun 19, 2012 at 8:07 PM, Sven Pedersen <[email protected]> wrote:

> Search the archives, use Google. Here's a link from someone's solution
> http://code.google.com/p/tesseract-ocr/issues/detail?id=714#c1
>
> On Tue, Jun 19, 2012 at 4:52 AM, eva charles <[email protected]> wrote:
> > Hi, I executed the following code to generate character-wise confidence
> > values:
> >
> > int main(int argc, char **argv) {
> >
> >     const char *lang="eng";
> >     const PIX *pixs;
> >     if ((pixs = pixRead(argv[1])) == NULL) {
> >         cout <<"Unsupported image type"<<endl;
> >         exit(3);
> >     }
> >     TessBaseAPI api;
> >     api.SetVariable("save_blob_choices", "T");
> >     api.SetPageSegMode(tesseract::PSM_SINGLE_WORD );
> >     api.SetImage(pixs);
> >     int rc = api.Init(argv[0], lang);
> >     api.Recognize(NULL);
> >     ResultIterator* ri = api.GetIterator();
> >     if(ri != 0)
> >     {
> >         do
> >         {
> >             const char* symbol = ri->GetUTF8Text(RIL_SYMBOL);
> >             if(symbol != 0)
> >             {
> >                 float conf = ri->Confidence(RIL_SYMBOL);
> >                 cout<<"\nnext symbol: "<< symbol << " confidence: " << conf <<"\n" <<endl;
> >             }
> >
> >             delete[] symbol;
> >         } while((ri->Next(RIL_SYMBOL)));
> >     }
> >     return 0;
> > }
> >
> > The output obtained for the attached image was:
> > next symbol: N confidence: 72.3563
> > next symbol: B confidence: 72.3563
> >
> > next symbol: E confidence: 69.9937
> > next symbol: T confidence: 69.9937
> > next symbol: R confidence: 69.9937
> > next symbol: A confidence: 69.9937
> > next symbol: N confidence: 69.9937
> > next symbol: G confidence: 69.9937
> > next symbol: - confidence: 69.9937
> > next symbol: I confidence: 69.9937
> >
> > As is evident, the confidence values for characters belonging to the same
> > word are the same.
> > Is this the expected output? Shouldn't the confidence values be different
> > for each character?
> > I tried executing the code for a word in which each character was in a
> > different font style, and yet the confidence value was the same for
> > characters belonging to the same word.
> > Please help me out.
> >
> > Thanks in advance
> > Uni.
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "tesseract-ocr" group.
> > To post to this group, send email to [email protected]
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en

