Hello, do you feed Tesseract adjacent words? I mean, do you input letters that are connected to one another (whole words), or discrete letters as independent blobs? I think the "converter" experience is excellent. Depending on your reply, I will decide whether to use OpenCV to chop the letters apart or to pass adjacent letter blobs (words) to Tesseract. Thanks in advance.
On Wednesday, October 3, 2007 10:08:54 AM UTC+3, Basu wrote:
>
> Thanks Glen for your suggestions.
> I backtracked through a lot of the programs and finally got the desired
> result a few minutes back.
> In dict\choices.cpp, the print_choices function does this task of printing
> probability and certainty values for each character (blob) in the
> input image.
> One can invoke this from the classify_blob function in
> wordrec\wordclass.cpp; there is a piece of code like this at the bottom
> of the function:
>
> #ifndef GRAPHICS_DISABLED
>   if (display_ratings && string)
>     print_choices(string, rating);
>
>   if (blob_pause)
>     window_wait(blob_window);
> #endif
>
> So you can use the function print_choices(string, rating);
> and get a probabilistic output.
> Please note: I have only used tprintf for printing output to my log
> file.
> A log file listing (along with my backtracking listings) is given below
> for reference.
> I gave a simple bmp file as input, with the handwritten text "basu"
> written in it.
> One can easily customize the output from here, I suppose.
>
> ===================================================================
> Tesseract Open Source OCR Engine
> tessedit_serial_unlv=0
> I am Here in 3rd Else
> I am Here in baseapi.cpp
> I am Here in control.cpp
> control.cpp->classify_word_pass1
> tessbox.cpp->tess_segment_pass1
> tfacepp.cpp->recog_word
> tfacepp.cpp->recog_word_recursive
> tface.cpp->cc_recog
> chopper.cpp->chop_word_main:
> chop_word: |-> b <- 53.66 -7.01| t 61.22 -8.00
> chop_word: |-> a <- 40.80 -8.00
> chop_word: |-> s <- 27.28 -8.46| o 30.09 -9.33| z 30.40 -9.43| e 36.22 -11.23
> chop_word: |-> u <- 46.97 -9.08| w 51.38 -9.93| a 56.72 -10.96
> OUTPUT:Best-Choice:basu
> OUTPUT:Blob Choices:4
> tfacepp.cpp->recog_word
> tfacepp.cpp->recog_word_recursive
> tface.cpp->cc_recog
> chopper.cpp->chop_word_main:
> chop_word: |-> b <- 53.66 -7.01| t 61.22 -8.00
> chop_word: |-> a <- 40.80 -8.00
> chop_word: |-> s <- 27.28 -8.46| o 30.09 -9.33| z 30.40 -9.43| e 36.22 -11.23
> chop_word: |-> u <- 46.97 -9.08| w 51.38 -9.93| a 56.72 -10.96
> basu
> ==========================================================================
>
> Hope this will be helpful info for somebody like me, working with
> handwritten data and wanting some kind of probability scores for each
> character.
> Thanks all,
> Basu
>
> On Sep 28, 9:21 pm, "[email protected]" <[email protected]> wrote:
> > The DLL returns a confidence value for the word. There is a define that
> > the DLL turns on to do this. You may trace this value backwards to
> > find what you are looking for.
> >
> > On Sep 28, 8:14 am, "[email protected]" <[email protected]> wrote:
> > > The DLL returns a confidence value for the word. There is a define that
> > > the DLL turns on to do this. You may trace this value backwards to
> > > find what you are looking for.
> > >
> > > On Sep 28, 5:51 am, Basu <[email protected]> wrote:
> > > > Hi,
> > > >
> > > > I am trying hard to generate some probability scores for each
> > > > character in a word.
> > > > As I am working with handwritten words, it may be useful information.
> > > >
> > > > E.g., if I give "hello" as input, as a handwritten bmp/tif word image,
> > > > I now get output such as "nollo" (fixed, with no alternative
> > > > suggestions).
> > > > I want to generate output in the following probabilistic form (shown
> > > > vertically for each character in the word):
> > > > a(0.01)..b(0.02)....n(0.6) .. ..z(0.01) --> recognized as 'n'
> > > > a(0.01)..b(0.02)....o(0.7) .. ..z(0.01) --> recognized as 'o'
> > > > a(0.01)..b(0.02)....l(0.6) ... ..z(0.01) --> recognized as 'l'
> > > > a(0.01)..b(0.02)....l(0.6) ... ..z(0.01) --> recognized as 'l'
> > > > a(0.01)..b(0.02)....o(0.8).. ..z(0.01) --> recognized as 'o'
> > > >
> > > > While working on this problem and studying the code, I have found the
> > > > following information.
> > > > The sequence of execution of the significant functions through the
> > > > different programs is as follows (in the standard scenario):
> > > >
> > > > TessBaseAPI::TesseractRectUNLV (baseapi.cpp)
> > > > TessBaseAPI::Recognize (baseapi.cpp)
> > > > recog_all_words (control.cpp)
> > > > classify_word_pass1 (control.cpp)
> > > > tess_segment_pass1 (tessbox.cpp)
> > > > recog_word (tfacepp.cpp)
> > > > recog_word_recursive (tfacepp.cpp)
> > > > cc_recog (tface.cpp)
> > > > chop_word_main (chopper.cpp)
> > > > etc...
> > > > etc...
> > > >
> > > > Now, this chop_word_main returns a CHOICES_LIST, that is, a list of
> > > > possible words according to the best scores.
> > > > Can anybody help me here with how to get this list,
> > > > or the choice values for each character blob? They are floats.
> > > > Generating a separate output file from here may also help me.
> > > >
> > > > I am somewhat confused here.
> > > > For each change in a program, do I have to rebuild the whole
> > > > application (it takes a lot of time)?
> > > > I am working in VC++.
> > > >
> > > > basu.
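
For anyone who wants to turn chop_word log lines like the ones quoted above into the per-character form asked for in the original question, here is a minimal sketch. It is not part of Tesseract: Choice, ParseChopWordLine and CertaintiesToProbabilities are hypothetical names, the line format is the custom one from the quoted log, and I am assuming the two numbers per candidate are rating and certainty, in that order. The normalization is a plain softmax over the certainty values, which is my own choice for getting numbers that sum to 1, not something the engine computes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One candidate for a blob, as printed in the custom log format.
struct Choice {
  std::string label;  // candidate character
  double rating;      // first number in the log (assumed to be the rating)
  double certainty;   // second number (assumed to be the certainty)
};

// Parses one line such as
//   "chop_word: |-> b <- 53.66 -7.01| t 61.22 -8.00"
// Lines without the "chop_word:" tag yield an empty vector.
std::vector<Choice> ParseChopWordLine(const std::string& line) {
  std::vector<Choice> choices;
  const std::string tag = "chop_word:";
  const std::size_t pos = line.find(tag);
  if (pos == std::string::npos) return choices;

  std::istringstream segments(line.substr(pos + tag.size()));
  std::string segment;
  while (std::getline(segments, segment, '|')) {
    // Split the segment on whitespace and drop the "->" / "<-" markers.
    std::istringstream tokens(segment);
    std::vector<std::string> fields;
    std::string token;
    while (tokens >> token) {
      if (token != "->" && token != "<-") fields.push_back(token);
    }
    if (fields.size() != 3) continue;  // not a "label rating certainty" triple
    try {
      choices.push_back(
          {fields[0], std::stod(fields[1]), std::stod(fields[2])});
    } catch (const std::exception&) {
      // Skip segments whose numbers cannot be parsed.
    }
  }
  return choices;
}

// Softmax over the certainties: values closer to 0 get a larger share.
// This normalization is an assumption of this sketch, not Tesseract's.
std::vector<double> CertaintiesToProbabilities(
    const std::vector<Choice>& choices) {
  assert(!choices.empty());
  double best = choices[0].certainty;
  for (const Choice& c : choices) best = std::max(best, c.certainty);

  std::vector<double> probs;
  double total = 0.0;
  for (const Choice& c : choices) {
    probs.push_back(std::exp(c.certainty - best));  // shifted for stability
    total += probs.back();
  }
  for (double& p : probs) p /= total;
  return probs;
}
```

Fed the "b" line from the log, this gives roughly 0.73 for b and 0.27 for t. Candidates that were never printed get no share at all, so this is only a ranking among the listed alternatives, not a distribution over the whole alphabet.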

