There are several ways of interfacing with OCRopus.  By default, the
command line tools are set up for recognizing books and technical
reports.  Those recognizers will not work well on diagrams because the
layout analysis fails.

But you don't have to use the layout analysis; ocropus-linerec takes
individual images of text lines and gives you corresponding text.

There is no ready-made program for extracting text lines from diagrams
right now.  Eventually there will be one, but for now you still need to
program that yourself.  There are some potentially useful tools in
ocropy/ocrolib, but it's a non-trivial task.
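
To give a rough idea of what "program that yourself" could look like,
here is a minimal sketch in plain Python.  It does not use ocrolib and is
not how OCRopus does layout analysis; it is one simple heuristic.  It
assumes the diagram is already binarized into a list of rows (1 = ink,
0 = background), that characters are small connected components, and that
boxes and connector lines are large ones.  The function names and the
max_char_size and max_gap thresholds are made up for illustration and
would need tuning on real images.

```python
from collections import deque


def connected_components(img):
    """Return inclusive bounding boxes (top, left, bottom, right)
    of the 4-connected ink regions in a binary image."""
    h = len(img)
    w = len(img[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and img[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def candidate_text_lines(img, max_char_size=40, max_gap=10):
    """Group character-sized components into candidate text-line boxes.

    max_char_size and max_gap are hypothetical thresholds in pixels.
    """
    # Drop components too large to be a character (shapes, arrows, frames).
    chars = [b for b in connected_components(img)
             if b[2] - b[0] + 1 <= max_char_size
             and b[3] - b[1] + 1 <= max_char_size]
    chars.sort(key=lambda b: (b[1], b[0]))  # process left to right
    lines = []
    for top, left, bottom, right in chars:
        for i, (lt, ll, lb, lr) in enumerate(lines):
            overlaps_vertically = top <= lb and bottom >= lt
            close_horizontally = left - lr <= max_gap
            if overlaps_vertically and close_horizontally:
                # Grow the existing line box to include this character.
                lines[i] = (min(lt, top), min(ll, left),
                            max(lb, bottom), max(lr, right))
                break
        else:
            lines.append((top, left, bottom, right))
    return lines
```

Each returned box could then be cropped out of the original image and
saved as its own line image for the line recognizer.  Text that touches
a shape outline, or labels that are rotated or stacked, will defeat this
heuristic, which is part of why the task is non-trivial.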

Tom

On Jan 3, 1:42 pm, Manon wrote:
> Hi,
>
> I am a member of a bachelor project at Hasso Plattner Institute Potsdam
> (Germany). We are about to build an online process platform and for
> this we need an OCR program which is able to extract the texts from
> pictures of process models (like BPMN, EPK etc).
>
> OCRopus is the best one we found but it can't find enough of the texts
> and often nothing at all.
>
> Will this "find text in between graphs and images" algorithm be
> implemented in the near future (let's say by the beginning of
> February)?
> Or how much work would it be to implement this? If it wouldn't
> go beyond the scope of our project, we could implement it ourselves.
>
> Thanks in advance,
> Manon
