Re: character-wise confidence values

eva charles Tue, 19 Jun 2012 23:29:25 -0700

I have already looked at that code. It gives me the same result
(i.e. equal confidence values for all characters belonging to a word, even
when each character is in a different font).
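
To make the symptom concrete, here is a minimal sketch of a check for it. It assumes the (symbol, confidence) pairs printed by the loop quoted below have been collected into a vector. SymbolConf and groupByConfidence are hypothetical names for illustration only; they are not part of the Tesseract API.

```cpp
#include <cmath>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type (not part of Tesseract): one recognized symbol
// together with the confidence reported for it.
struct SymbolConf {
    std::string symbol;
    float confidence;
};

// Merges consecutive symbols whose confidences differ by less than eps.
// Returns one (text, confidence) pair per run. If the values really were
// per-character, most runs would contain a single symbol.
std::vector<std::pair<std::string, float> > groupByConfidence(
        const std::vector<SymbolConf>& symbols, float eps = 1e-4f) {
    std::vector<std::pair<std::string, float> > runs;
    for (std::size_t i = 0; i < symbols.size(); ++i) {
        const SymbolConf& s = symbols[i];
        if (!runs.empty() &&
            std::fabs(runs.back().second - s.confidence) < eps) {
            runs.back().first += s.symbol;  // same confidence: extend the run
        } else {
            runs.push_back(std::make_pair(s.symbol, s.confidence));  // new run
        }
    }
    return runs;
}
```

Fed the ten pairs from the output quoted below, this yields only two runs, "NB" and "ETRANG-I", which suggests the reported values are word-level rather than symbol-level.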


On Tue, Jun 19, 2012 at 8:07 PM, Sven Pedersen <[email protected]> wrote:

> Search the archives, use Google. Here's a link from someone's solution
> http://code.google.com/p/tesseract-ocr/issues/detail?id=714#c1
>
> On Tue, Jun 19, 2012 at 4:52 AM, eva charles <[email protected]> wrote:
> > Hi, I executed the following code to generate character-wise confidence
> > values:
> >
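> > #include <cstdlib>
> > #include <iostream>
> > #include <tesseract/baseapi.h>
> > #include <tesseract/resultiterator.h>
> > #include <leptonica/allheaders.h>
> >
> > using namespace std;
> > using namespace tesseract;
> >
> > int main(int argc, char **argv) {
> >     const char *lang = "eng";
> >     if (argc < 2) {
> >         cout << "Usage: " << argv[0] << " <image>" << endl;
> >         exit(2);
> >     }
> >     Pix *pixs = pixRead(argv[1]);
> >     if (pixs == NULL) {
> >         cout << "Unsupported image type" << endl;
> >         exit(3);
> >     }
> >     TessBaseAPI api;
> >     // Initialize the engine first; SetImage needs an initialized engine.
> >     if (api.Init(argv[0], lang) != 0) {
> >         cout << "Could not initialize tesseract" << endl;
> >         exit(1);
> >     }
> >     api.SetVariable("save_blob_choices", "T");
> >     api.SetPageSegMode(PSM_SINGLE_WORD);
> >     api.SetImage(pixs);
> >     api.Recognize(NULL);
> >     ResultIterator *ri = api.GetIterator();
> >     if (ri != 0) {
> >         do {
> >             // GetUTF8Text returns a new[]-allocated string we must free.
> >             const char *symbol = ri->GetUTF8Text(RIL_SYMBOL);
> >             if (symbol != 0) {
> >                 float conf = ri->Confidence(RIL_SYMBOL);
> >                 cout << "\nnext symbol: " << symbol << " confidence: "
> >                      << conf << "\n" << endl;
> >             }
> >             delete[] symbol;
> >         } while (ri->Next(RIL_SYMBOL));
> >         delete ri;
> >     }
> >     pixDestroy(&pixs);
> >     return 0;
> > }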
> >
> > the output obtained for the attached image was:
> > next symbol: N confidence: 72.3563
> > next symbol: B confidence: 72.3563
> >
> > next symbol: E confidence: 69.9937
> > next symbol: T confidence: 69.9937
> > next symbol: R confidence: 69.9937
> > next symbol: A confidence: 69.9937
> > next symbol: N confidence: 69.9937
> > next symbol: G confidence: 69.9937
> > next symbol: - confidence: 69.9937
> > next symbol: I confidence: 69.9937
> >
> > As is evident, the confidence values for characters belonging to the same
> > word are the same.
> > Is this the expected output? Shouldn't the confidence values be different
> > for each character?
> > I tried executing the code for a word in which each character was in a
> > different font style, and yet the confidence value was the same for
> > characters belonging to the same word.
> > Please help me out.
> >
> > Thanks in Advance
> > Uni.
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "tesseract-ocr" group.
> > To post to this group, send email to [email protected]
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en
