We're reimplementing and improving a text/graphics segmentation
algorithm from Leptonica, which takes a fairly standard
morphology-based approach.
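
For readers unfamiliar with the idea, here is a toy sketch of what
"morphology-based" can mean in this context. It is not Leptonica's
algorithm and not the OCRopus reimplementation; the function names, the
square structuring element, and the tiny test page are all assumptions
made for illustration. The idea shown: a morphological opening removes
thin strokes (text) and keeps large solid regions (graphics), so the
opened image can serve as a graphics mask.

```python
def erode(img, k):
    """Binary erosion with a (2k+1)x(2k+1) square.
    Pixels outside the image are treated as background (0)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(k, h - k):
        for x in range(k, w - k):
            if all(img[yy][xx]
                   for yy in range(y - k, y + k + 1)
                   for xx in range(x - k, x + k + 1)):
                out[y][x] = 1
    return out


def dilate(img, k):
    """Binary dilation with a (2k+1)x(2k+1) square, clipped at the border."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for yy in range(max(0, y - k), min(h, y + k + 1)):
                    for xx in range(max(0, x - k), min(w, x + k + 1)):
                        out[yy][xx] = 1
    return out


def split_text_graphics(img, k=1):
    """Hypothetical helper: return (text, graphics) binary masks.
    graphics = opening of img (erosion followed by dilation);
    text     = foreground pixels not covered by the graphics mask."""
    graphics = dilate(erode(img, k), k)
    text = [[p & (1 - g) for p, g in zip(prow, grow)]
            for prow, grow in zip(img, graphics)]
    return text, graphics


# Toy page: a solid 6x6 block ("graphics") and a 1-pixel-thick stroke ("text").
page = [[0] * 20 for _ in range(12)]
for y in range(2, 8):
    for x in range(2, 8):
        page[y][x] = 1
for x in range(10, 18):
    page[9][x] = 1

text, graphics = split_text_graphics(page, k=1)
```

On this toy page the block ends up in `graphics` and the stroke in
`text`. Real documents need much more than this (halftone handling,
size-adaptive structuring elements, noise removal), so treat it only as
a picture of the basic mechanism.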

We have also implemented text/graphics segmentation based on machine
learning.

Both of those will make it into the codebase in the future (but I
can't say exactly when).  Both work for arbitrary layouts.

To actually analyze the resulting layouts, you need to use the
Voronoi page segmenter.  It performs less well on Manhattan layouts,
but it works on many non-Manhattan documents.

Tom

On May 27, 7:07 am, avd wrote:
> Which algorithms does OCRopus use for separating text and graphics
> in a document image that can have an arbitrary non-Manhattan layout?
