Hi Tom, I just started using TesseractExtractResult() with Tesseract version 3.0. The header file describes this API as part of the "OCRopus add-on", but as far as I can tell it uses the same training data as Tesseract (eng.traineddata) and appears to call Tesseract itself. Yet you seem to say that OCRopus is not using Tesseract. Could you please clarify?
Thanks,
Patrick

On Aug 17, 8:29 am, Thomas Breuel <[email protected]> wrote:
> > I was wondering if OCRopus still uses Tesseract for line recognition?
> > From what I gather in the release notes for 0.4 (and from what I have
> > determined from putting print statements in the code to follow the
> > execution path), OCRopus no longer uses Tesseract, but rather a new
> > line recognizer created by you guys.
>
> Correct.
>
> > If this is the case, could you
> > provide an overview of the changes required to have it again call
> > Tesseract? I thought it would be a simple one line change in
> > ocr-commands.cc by including the tesseract header and in the
> > main_lines2fsts( ) method changing:
> >
> > linerec = glinerec::make_Linerec();
> >
> > to
> >
> > linerec = make_TesseractRecognizeLine();
>
> Unfortunately, interfacing with Tesseract isn't easy; that's why we
> don't have it in the default build anymore.
>
> There is a separate subproject for a Tesseract interface called ocrotess here:
>
> http://iupr1.cs.uni-kl.de/cgi-bin/hgwebdir.cgi/ocrotess/
>
> > Currently Tesseract is providing us better results for our images than
> > OCRopus is, but we would like to see the results that OCRopus gives
> > when it is using Tesseract.
>
> It's pointless to carry out performance comparisons between OCRopus
> and Tesseract right now; the models shipping with the OCRopus
> recognizer have been trained on only a small number of characters and
> styles. They will perform well on some styles and poorly on others,
> depending on resolution and fonts.
>
> Furthermore, for book recognition, you should use book-adaptive
> recognition with OCRopus, which results in substantial improvements in
> recognition rates.
>
> Tom
