Hi all,

I was wondering if OCRopus still uses Tesseract for line recognition?
From what I gather in the release notes for 0.4 (and from what I have
determined from putting print statements in the code to follow the
execution path), OCRopus no longer uses Tesseract, but rather a new
line recognizer created by you guys. If this is the case, could you
provide an overview of the changes required to have it again call
Tesseract? I thought it would be a simple one-line change in
ocr-commands.cc: include the Tesseract header and, in the
main_lines2fsts() function, change:

linerec = glinerec::make_Linerec();

to

linerec = make_TesseractRecognizeLine();

but quite a few hours spent trying to get OCRopus to compile after making
this change have proved me wrong. On the other hand, if I am completely
mistaken and OCRopus still uses Tesseract, could you point me to the
file where OCRopus calls Tesseract?

The group I am working with has been comparing OCRopus and Tesseract,
and we noticed that the text generated by OCRopus when following the
whole OCR process (i.e. ocropus book2pages dir book.tif; ocropus
pages2lines dir; ocropus lines2fsts dir; ocropus fsts2text dir)
differs from what we get when we run the procedure only up to
pages2lines and then feed the resulting line images directly into
Tesseract, so we would like to understand where the difference comes from.

Currently Tesseract is giving us better results on our images than
OCRopus is, but we would like to see the results that OCRopus gives
when it is using Tesseract.

Thanks,

John