Re: Does OCRopus still use Tesseract for line recognition and...

Thomas Breuel Mon, 17 Aug 2009 06:11:20 -0700

> I was wondering if OCRopus still uses Tesseract for line recognition?
> From what I gather in the release notes for 0.4 (and from what I have
> determined from putting print statements in the code to follow the
> execution path), OCRopus no longer uses Tesseract, but rather a new
> line recognizer created by you guys.


Correct.

> If this is the case, could you
> provide an overview of the changes required to have it call
> Tesseract again? I thought it would be a simple one-line change in
> ocr-commands.cc: include the Tesseract header and, in the
> main_lines2fsts() method, change:
>
> linerec = glinerec::make_Linerec();
>
> to
>
> linerec = make_TesseractRecognizeLine();

Unfortunately, interfacing with Tesseract isn't easy; that's why we
don't have it in the default build anymore.

There is a separate subproject for a Tesseract interface called ocrotess here:

http://iupr1.cs.uni-kl.de/cgi-bin/hgwebdir.cgi/ocrotess/

> Currently Tesseract is providing us better results for our images than
> OCRopus is, but we would like to see the results that OCRopus gives
> when it is using Tesseract.

It's pointless to carry out performance comparisons between OCRopus
and Tesseract right now; the models shipping with the OCRopus
recognizer have been trained on only a small number of characters and
styles.  They will perform well on some styles and poorly on others,
depending on resolution and fonts.

Furthermore, for book recognition, you should use book-adaptive
recognition with OCRopus, which results in substantial improvements in
recognition rates.

Tom

