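The ResultIterator has a method, IsAtBeginningOf, which reports whether the iterator is at the start of an element at a given level (for example, a word). You can use it for this.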

Try putting this code just at the start of the loop:

    if (ri->IsAtBeginningOf(tesseract::RIL_WORD)) { printf("======= Start word\n"); }
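
If you would rather collect the per-character confidences per word than just print a marker, the same flag can drive the grouping. This is only a rough sketch, not part of the Tesseract API: SymbolResult, WordResult and GroupIntoWords are hypothetical names, and it assumes you copy the text, the confidence and the result of IsAtBeginningOf(tesseract::RIL_WORD) into a SymbolResult once per symbol inside your loop.

```cpp
#include <string>
#include <vector>

// Hypothetical record, filled in once per symbol inside the iterator loop.
struct SymbolResult {
    std::string text;  // copied from ri->GetUTF8Text(tesseract::RIL_SYMBOL)
    float conf;        // copied from ri->Confidence(tesseract::RIL_SYMBOL)
    bool starts_word;  // copied from ri->IsAtBeginningOf(tesseract::RIL_WORD)
};

// Hypothetical container: one word and the confidence of each of its symbols.
struct WordResult {
    std::string text;
    std::vector<float> confs;
};

// Open a new word whenever a symbol is flagged as the beginning of a word,
// then append the symbol to the word that is currently open.
std::vector<WordResult> GroupIntoWords(const std::vector<SymbolResult>& symbols) {
    std::vector<WordResult> words;
    for (const SymbolResult& s : symbols) {
        if (s.starts_word || words.empty()) {
            words.push_back(WordResult{});
        }
        words.back().text += s.text;
        words.back().confs.push_back(s.conf);
    }
    return words;
}
```

For the "Book Now" image, with B and N flagged as word starts, this should give two entries, one holding the four confidences for BOOK and one holding the three for NOW.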

John

On Thursday, 8 October 2015 10:33:23 UTC+1, RK wrote:
>
> Hi,
>
> I have this code, which gives me the confidence level of each character in a 
> word. If the image contains multiple words, it gives me the confidence level 
> of each character. 
>
> But the problem is that it prints them all in one sequence (it does not 
> account for the space between words). How do I identify where one word ends 
> and the next word's character probabilities start? Any help on this?
>
>
> For example: if my image contains "Book Now" (refer to the attached image), my 
> current output is as follows. I want to introduce a delimiter after the symbol 
> "K", so that I know that everything up to symbol K is one word and what 
> follows is another word.
>
> sample output:
>
> symbol B, conf: 95.727936 - B conf: 95.727936
> - 3 conf: 83.558624
> - E conf: 81.664284
> ---------------------------------------------
> symbol O, conf: 90.067154 - O conf: 90.067154
> - 0 conf: 87.427773
> - Q conf: 83.844460
> - C conf: 82.962616
> - G conf: 79.682472
> ---------------------------------------------
> symbol O, conf: 90.468826 - O conf: 90.468826
> - 0 conf: 87.815132
> - C conf: 86.248314
> - Q conf: 82.877472
> ---------------------------------------------
> symbol K, conf: 93.121216 - K conf: 93.121216
> ---------------------------------------------
> symbol N, conf: 91.598183 - N conf: 91.598183
> ---------------------------------------------
> symbol O, conf: 89.931847 - O conf: 89.931847
> - 0 conf: 87.237823
> - Q conf: 84.576927
> - C conf: 82.600273
> - G conf: 80.553169
> - D conf: 79.337044
> ---------------------------------------------
> symbol W, conf: 96.001007 - W conf: 96.001007
> - w conf: 86.990593
> ---------------------------------------------
>
>
> #include <tesseract/baseapi.h>
> #include <leptonica/allheaders.h>
> #include <tesseract/pageiterator.h>
> #include <tesseract/resultiterator.h>
> #include <cstdio>
> #include <iostream>
> int main()  
> {
> Pix *image = pixRead("sample.png");
>   tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
>   api->Init(NULL, "eng");
>   api->SetImage(image);
>   api->SetVariable("save_blob_choices", "T");
>   api->SetRectangle(37, 228, 548, 31);
>   api->Recognize(NULL);
>
>   tesseract::ResultIterator* ri = api->GetIterator();
>   tesseract::PageIteratorLevel level = tesseract::RIL_SYMBOL;
>   if(ri != 0) {
>       do {
>           const char* symbol = ri->GetUTF8Text(level);
>           float conf = ri->Confidence(level);
>           if(symbol != 0) {
>               printf("symbol %s, conf: %f", symbol, conf);
>               bool indent = false;
>               tesseract::ChoiceIterator ci(*ri);
>               do {
>                   if (indent) printf("\t\t ");
>                   printf("\t- ");
>                   const char* choice = ci.GetUTF8Text();
>                   printf("%s conf: %f\n", choice, ci.Confidence());
>                   indent = true;
>               } while(ci.Next());
>           }
>           printf("$\n");
>           delete[] symbol;
>       } while((ri->Next(level)));
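>   }
>
>   // Release the iterator, the API and the image.
>   delete ri;
>   api->End();
>   delete api;
>   pixDestroy(&image);
>   return 0;
> }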
>
