> As I understand it, one of the strengths of OCRopus is that it adds
> powerful "Document Layout Analysis" to a powerful OCR engine
> (Tesseract). Am I right?

OCRopus is really a separate project.  You can use Tesseract as a line
recognizer, but you don't have to (actually, we need to re-enable that
capability; it is currently broken).

> I believe that a positive step for OCRopus would be the
> development of a frontend (perhaps based on one that already
> exists, like gscan2pdf) that would enable ordinary people to use
> OCRopus with ease.

Yes, once the API settles down.

> I also think that, if possible, adding the "text under the
> image" feature (as in ABBYY FineReader; here is a picture of
> FineReader's feature in
> Portuguese: http://www.imagebam.com/image/c1c78f93276763
> ) would be very welcome. This feature enables the scanning of old
> texts without concern for correcting every error, because those
> who are reading the text also have access to the original image (here
> is an example of "text under image" in

That's part of the DECAPOD project.  It actually does a lot more,
including token-based compression.  It also provides a web-based
frontend.

> Can OCRopus read Portuguese? If not, are the tesseract-ocr language
> files for Brazilian Portuguese text compatible with OCRopus?

No, the model files are completely different.  Also, we're still
debugging the Unicode support.

Tom
