Hello everyone,
Tesseract yields decent results from the command line. However, the results
from the API are not as good.
I've written a few C wrapper functions in order to use Tesseract from
Harbour.
This code runs without errors but yields different and less accurate text
compared with the command-line output:
handle := TessBaseAPICreate() //Using Tesseract to OCR image
IF TessBaseAPIInit3( handle, NIL, "eng" ) != 0 ; LOOP ; ENDIF //abort if english traindata file can't be found locally.
//line below is commented to avoid program from freezing when calling TessBaseAPIGetUTF8Text()
//TessBaseAPISetPageSegMode( handle, 3 ) //this line causes program to freeze when calling GetUTF8Text() below
img := pixRead( ALLTRIM( cPath )+cFile )
TessBaseAPISetImage2( handle, img )
IF TessBaseAPIRecognize( handle, Nil ) != 0 ; LOOP ;ENDIF //abort if Recognize fails
cText := TessBaseAPIGetUTF8Text( handle ) //program will freeze here unless SetPageSegMode above is commented
I'm guessing the output differs because the PSM defaults to a different
value on the command line than it does in the API. However, when I
uncomment the line "TessBaseAPISetPageSegMode( handle, 3 )" to make sure
the PSM is the same as the command-line default, the program freezes when
executing TessBaseAPIGetUTF8Text( handle ).
Can someone please help me understand what I might be doing wrong?
Thank you,
Reinaldo.