Re: Some Clue on Generating Probablity scores for each character/word

merve Thu, 12 Jul 2012 04:12:02 -0700

Hello,
Do you feed Tesseract adjacent letters, I mean letters that are physically
connected to each other, or discrete letters as independent blobs?
I think the "converter" experience is excellent.
Depending on your reply, I will decide whether to use OpenCV to chop the
letters or to input adjacent letter blobs (words) to Tesseract.
Thanks in advance.


On Wednesday, October 3, 2007 10:08:54 AM UTC+3, Basu wrote:
>
> Thanks, Glen, for your suggestions.
> I traced back through a lot of the source files and finally got the
> desired result a few minutes ago.
> In dict\choices.cpp, the print_choices function prints the
> probability and certainty values for each character (blob) in the
> input image.
> It can be invoked from the classify_blob function in
> wordrec\wordclass.cpp; there is a piece of code like this at the
> bottom of that function:
>
> #ifndef GRAPHICS_DISABLED
>   if (display_ratings && string)
>     print_choices(string, rating);
>
>   if (blob_pause)
>     window_wait(blob_window);
> #endif
>
> So you can use the function print_choices(string, rating)
> to get a probabilistic output.
> Please note: I have only used tprintf to print output to my log
> file.
> A log file listing (along with my backtracking trace messages) is
> given below for reference.
> I gave a simple bmp file as input, with the handwritten text "basu"
> written in it.
> One can easily customize the output from here, I suppose.
>
> ===================================================================
> Tesseract Open Source OCR Engine
> tessedit_serial_unlv=0
> I am Here in 3rd Else
> I am Here in baseapi.cpp
> I am Here in control.cpp
> control.cpp->classify_word_pass1
> tessbox.cpp->tess_segment_pass1
> tfacepp.cpp->recog_word
> tfacepp.cpp->recog_word_recursive
> tface.cpp->cc_recog
> chopper.cpp->chop_word_main:
> chop_word:    |->  b  <-         53.66     -7.01| t          61.22     -8.00
> chop_word:    |->  a  <-         40.80     -8.00
> chop_word:    |->  s  <-         27.28     -8.46| o          30.09     -9.33| z          30.40     -9.43| e          36.22    -11.23
> chop_word:    |->  u  <-         46.97     -9.08| w          51.38     -9.93| a          56.72    -10.96
> OUTPUT:Best-Choice:basu
> OUTPUT:Blob Choices:4
> tfacepp.cpp->recog_word
> tfacepp.cpp->recog_word_recursive
> tface.cpp->cc_recog
> chopper.cpp->chop_word_main:
> chop_word:    |->  b  <-         53.66     -7.01| t          61.22     -8.00
> chop_word:    |->  a  <-         40.80     -8.00
> chop_word:    |->  s  <-         27.28     -8.46| o          30.09     -9.33| z          30.40     -9.43| e          36.22    -11.23
> chop_word:    |->  u  <-         46.97     -9.08| w          51.38     -9.93| a          56.72    -10.96
> basu
> ==========================================================================
>
> Hope this will be helpful for somebody like me who is working with
> handwritten data and wants some kind of probability scores for each
> character.
> Thanks all,
> Basu
>
>
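
The per-blob lines in the log above are regular enough to parse back into
structured data. Below is a minimal C++ sketch, assuming each entry sits on a
single line (the mail client wrapped some of them), that alternatives are
separated by '|', that the top choice is the one wrapped in "->" and "<-", and
that the first number is the rating and the second the certainty. The names
BlobChoice and ParseChoiceLine are hypothetical and are not part of Tesseract.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One alternative reported for a blob (hypothetical type, not a Tesseract one).
struct BlobChoice {
  std::string label;  // candidate character
  double rating;      // assumed: first number on the log line
  double certainty;   // assumed: second number on the log line
  bool best;          // true if the entry was marked with "->" ... "<-"
};

// Parses one unwrapped "chop_word: |-> b <- 53.66 -7.01| t 61.22 -8.00" line.
// Pieces that do not look like "label rating certainty" are skipped.
std::vector<BlobChoice> ParseChoiceLine(const std::string& line) {
  std::vector<BlobChoice> choices;
  std::istringstream pieces(line);
  std::string piece;
  while (std::getline(pieces, piece, '|')) {
    std::istringstream tokens(piece);
    std::vector<std::string> fields;
    std::string tok;
    bool best = false;
    while (tokens >> tok) {
      if (tok == "->" || tok == "<-") {
        best = true;  // arrow markers flag the top choice
        continue;
      }
      fields.push_back(tok);
    }
    if (fields.size() != 3) continue;  // e.g. the "chop_word:" prefix piece
    try {
      choices.push_back(
          {fields[0], std::stod(fields[1]), std::stod(fields[2]), best});
    } catch (const std::exception&) {
      // Non-numeric fields: ignore this piece.
    }
  }
  return choices;
}
```

Calling ParseChoiceLine on each "chop_word:" line of the log gives one vector
per blob, with the alternatives in the order they were printed.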
> On Sep 28, 9:21 pm, "[email protected]" <[email protected]> wrote:
> > The dll returns a confidence value for the word. There is a define that
> > the dll turns on to do this. You can trace this value backwards to
> > find what you are looking for.
> >
> > On Sep 28, 8:14 am, "[email protected]" <[email protected]> wrote:
> >
> >
> >
> > > The dll returns a confidence value for the word. There is a define that
> > > the dll turns on to do this. You can trace this value backwards to
> > > find what you are looking for.
> >
> > > On Sep 28, 5:51 am, Basu <[email protected]> wrote:
> >
> > > > Hi,
> >
> > > > I am trying hard to generate probability scores for each
> > > > character in a word.
> > > > As I am working with handwritten words, this may be useful information.
> >
> > > > E.g., if I give a handwritten bmp/tif word image of "hello" as input,
> > > > I currently get output such as "nollo" (fixed, with no alternative
> > > > suggestions).
> > > > I want to generate output in the following probabilistic form (shown
> > > > vertically for each character in the word):
> > > > a(0.01)..b(0.02)....n(0.6) ..    ..z(0.01)  --> recognized as 'n'
> > > > a(0.01)..b(0.02)....o(0.7)   ..  ..z(0.01)  --> recognized as 'o'
> > > > a(0.01)..b(0.02)....l(0.6)    ... ..z(0.01)  --> recognized as 'l'
> > > > a(0.01)..b(0.02)....l(0.6)  ...   ..z(0.01)  --> recognized as 'l'
> > > > a(0.01)..b(0.02)....o(0.8)..     ..z(0.01)  --> recognized as 'o'
> >
> > > > While working on this problem and studying the code, I found the
> > > > following information:
> > > > The sequence of execution of the significant functions through the
> > > > different source files is as follows (in the standard scenario):
> >
> > > > TessBaseAPI::TesseractRectUNLV (baseapi.cpp)
> > > > TessBaseAPI::Recognize (baseapi.cpp)
> > > > recog_all_words (control.cpp)
> > > > classify_word_pass1 (control.cpp)
> > > > tess_segment_pass1 (tessbox.cpp)
> > > > recog_word (tfacepp.cpp)
> > > > recog_word_recursive (tfacepp.cpp)
> > > > cc_recog (tface.cpp)
> > > > chop_word_main (chopper.cpp)
> > > > etc...
> > > > etc...
> >
> > > > Now, chop_word_main returns a CHOICES_LIST, that is, a list of
> > > > possible words according to the best scores.
> > > > Can anybody help me here with how to get this list,
> > > > or the choice values for each character blob? They are floats.
> > > > Generating a separate output file from here may also help me.
> >
> > > > I am somewhat confused here.
> > > > For each change in a source file, do I have to rebuild the whole
> > > > application (which takes a lot of time)?
> > > > I am working in VC++.
> >
> > > > basu.
>
>
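
One possible way to get from the certainty values that print_choices logs to
the normalized per-character form asked for in the original question is a
softmax over the alternatives listed for a blob. This is only a sketch under
that assumption: it is not how Tesseract computes anything, the resulting
numbers are not calibrated probabilities, and only the alternatives that
appear in the log receive any weight. NormalizeCertainties is a hypothetical
helper name.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Maps the certainties of one blob's alternatives to scores that sum to 1.
// Assumption: a higher (less negative) certainty means a more likely choice.
std::vector<double> NormalizeCertainties(const std::vector<double>& certainties) {
  std::vector<double> scores;
  if (certainties.empty()) return scores;

  // Subtract the maximum before exponentiating for numerical stability.
  const double max_c =
      *std::max_element(certainties.begin(), certainties.end());

  double sum = 0.0;
  for (double c : certainties) {
    scores.push_back(std::exp(c - max_c));
    sum += scores.back();
  }
  for (double& s : scores) s /= sum;  // normalize so the scores sum to 1
  return scores;
}
```

For the 's' blob in the log, the input would be the four certainties
-8.46, -9.33, -9.43 and -11.23, and the highest score goes to 's' because it
has the least negative certainty.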

