The answer to that question depends on several factors.  Tesseract is
fairly mature and works on arbitrary binary documents.  Tesseract
before 3.0 didn't work well on isolated lines, and it also didn't have
much in the way of layout analysis.  Tesseract 3.0 offers layout
analysis, a neural network recognizer, and improved language modeling.

OCRopus has not had a stable release yet.  Its layout analysis is
probably better than Tesseract's.  Its text recognition isn't as good
as Tesseract's yet, but it's rapidly improving.  OCRopus also contains
a whole range of new technologies for page segmentation,
preprocessing, and language modeling.  Our long-term plan is to make
Tesseract available through OCRopus as well, once the 3.0 release and
APIs are stable.

OCRopus has largely moved to Python now, which has sped up
development and makes it easier to create custom solutions.

The upshot is: both solutions are going to be a lot of work, and they
both have their limitations.  If Tesseract gets your job done, just
use it for the time being.

Tom

On May 13, 6:06 pm, Christoph <[email protected]>
wrote:
> Hi,
>
> I am new to the ocropus project, so I've got a basic question. What
> are the major benefits of using ocropus rather than just tesseract, if
> I only want to train the OCR engine and use that data to recognize
> text inside image files which were already preprocessed (binarization,
> segmentation, ...), discounting postprocessing like semantic analysis
> and so on?
>
> --
> You received this message because you are subscribed to the Google Groups 
> "ocropus" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/ocropus?hl=en.

