Hi Tom,

I just started using TesseractExtractResult() with Tesseract version
3.0. The header file describes this API as part of the "OCRopus
add-on", but as far as I can tell it uses the same training data as
Tesseract (eng.traineddata) and appears to use Tesseract itself. Yet
you seem to say that OCRopus is not using Tesseract. Could you please
clarify?

Thanks,
Patrick

On Aug 17, 8:29 am, Thomas Breuel <[email protected]> wrote:
> > I was wondering if OCRopus still uses Tesseract for line recognition?
> > From what I gather in the release notes for 0.4 (and from what I have
> > determined from putting print statements in the code to follow the
> > execution path), OCRopus no longer uses Tesseract, but rather a new
> > line recognizer created by you guys.
>
> Correct.
>
> > If this is the case, could you
> > provide an overview of the changes required to have it call
> > Tesseract again? I thought it would be a simple one-line change in
> > ocr-commands.cc: include the Tesseract header and, in the
> > main_lines2fsts() method, change:
>
> > linerec = glinerec::make_Linerec();
>
> > to
>
> > linerec = make_TesseractRecognizeLine();
>
> Unfortunately, interfacing with Tesseract isn't easy; that's why we
> don't have it in the default build anymore.
>
> There is a separate subproject for a Tesseract interface called ocrotess here:
>
> http://iupr1.cs.uni-kl.de/cgi-bin/hgwebdir.cgi/ocrotess/
>
> > Currently Tesseract is providing us better results for our images than
> > OCRopus is, but we would like to see the results that OCRopus gives
> > when it is using Tesseract.
>
> It's pointless to carry out performance comparisons between OCRopus
> and Tesseract right now; the models shipping with the OCRopus
> recognizer have been trained on only a small number of characters and
> styles.  They will perform well on some styles and poorly on others,
> depending on resolution and fonts.
>
> Furthermore, for book recognition, you should use book-adaptive
> recognition with OCRopus, which results in substantial improvements in
> recognition rates.
>
> Tom
