Hi all,

I was wondering whether OCRopus still uses Tesseract for line recognition. From what I gather from the release notes for 0.4 (and from putting print statements in the code to follow the execution path), OCRopus no longer uses Tesseract, but rather a new line recognizer that you have written. If this is the case, could you provide an overview of the changes required to have it call Tesseract again?

I thought it would be a simple one-line change in ocr-commands.cc: include the Tesseract header and, in the main_lines2fsts() method, change

    linerec = glinerec::make_Linerec();

to

    linerec = make_TesseractRecognizeLine();

but quite a few hours spent trying to get OCRopus to compile after making this change have proved me wrong.

On the other hand, if I am completely mistaken and OCRopus still uses Tesseract, could you point me to the file where OCRopus calls Tesseract?

The group I am working with has been comparing OCRopus and Tesseract. We noticed that the text generated by OCRopus when following the whole OCR process, i.e.

    ocropus book2pages dir book.tif
    ocropus pages2lines dir
    ocropus lines2fsts dir
    ocropus fsts2text dir

differs from what we get when we follow the procedure only up to pages2lines and then feed the resulting lines into Tesseract, so we were wondering what accounts for the difference. Currently Tesseract gives us better results for our images than OCRopus does, but we would like to see the results that OCRopus gives when it is using Tesseract.

Thanks,
John
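P.S. In case it helps anyone reproduce this kind of comparison, here is a minimal sketch of how two OCR outputs could be scored against each other. It is only an illustration under stated assumptions: the function name `similarity` and the sample strings are made up for this example, and it relies on Python's standard difflib module rather than anything from OCRopus or Tesseract.

```python
import difflib


def similarity(text_a: str, text_b: str) -> float:
    """Return a ratio in [0, 1] describing how alike two OCR outputs are.

    Hypothetical helper for illustration; not part of OCRopus or Tesseract.
    """
    # Collapse runs of whitespace so that differences in line wrapping
    # are not counted as recognition differences.
    a = " ".join(text_a.split())
    b = " ".join(text_b.split())
    # autojunk=False keeps frequent characters (e.g. spaces) from being
    # ignored when the texts are long.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    # Made-up sample strings standing in for the two pipelines' outputs.
    full_pipeline_text = "The qu1ck brovvn fox"
    lines_then_tesseract_text = "The quick brown fox"
    score = similarity(full_pipeline_text, lines_then_tesseract_text)
    print(f"similarity: {score:.3f}")
```

A score of 1.0 means the two texts are identical after whitespace normalization; lower values indicate more character-level disagreement between the two runs.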
