OCRopus provides page segmentation algorithms that do just this.

The C++ and Python interface is called ISegmentPage.  The command line
tool is called ocropus-pseg (you probably need to run ocropus-binarize
first).  It outputs a color image that assigns a different color to
each region.  Several algorithms implement page segmentation; not all
of them work for all page types.
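
As a rough illustration of how such a color-coded image could be turned
back into rectangular regions, here is a minimal sketch in plain Python.
It assumes that each distinct non-white color marks one region and that
white is background; the exact color encoding written by ocropus-pseg may
differ, so check that first. The function name region_boxes and the
nested-list pixel format are made up for this example, and loading the
image into that format is left to whatever image library you use.

```python
def region_boxes(pixels, background=(255, 255, 255)):
    """Map each non-background color to its bounding box.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    Boxes are (top, left, bottom, right) with bottom/right exclusive.
    Assumption: one distinct color per region (hypothetical encoding).
    """
    boxes = {}
    for y, row in enumerate(pixels):
        for x, color in enumerate(row):
            color = tuple(color)
            if color == background:
                continue
            if color not in boxes:
                # First pixel seen for this region: start a 1x1 box.
                boxes[color] = [y, x, y + 1, x + 1]
            else:
                # Grow the box so that it also covers this pixel.
                box = boxes[color]
                box[0] = min(box[0], y)
                box[1] = min(box[1], x)
                box[2] = max(box[2], y + 1)
                box[3] = max(box[3], x + 1)
    return {color: tuple(box) for color, box in boxes.items()}


# Tiny synthetic "segmentation image" with two regions, A and B.
W = (255, 255, 255)
A = (1, 0, 0)
B = (2, 0, 0)
demo = [
    [W, A, A, W, W],
    [W, A, A, W, B],
    [W, W, W, W, B],
]
demo_boxes = region_boxes(demo)
```

Each box can then be used to crop the original page image and run OCR on
that crop, which keeps the text of each region separate.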

Tom

On May 12, 8:39 am, dialer <[email protected]> wrote:
> I am new to this. I want to be able to do the following, and I wonder
> whether there are any APIs or useful utilities I can use to accomplish
> it:
>
> I want to perform what I call 'region analysis' (not sure if that is
> the correct terminology) on images. Basically, given any image, there
> may be a few regions with characters in them. A region is an
> arbitrarily sized rectangular area; each region is made up of a
> cluster of words, and one region is separated from another by white
> space.
>
> Basically I want to split the images into regions, and then perform
> OCR on each region, as the characters found in each region are related
> information and thus I would like to store the information found in
> each region separately.
>
> Is there a way I could accomplish this?
>
> Thank you very much for reading.
>
> --
> You received this message because you are subscribed to the Google Groups 
> "ocropus" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/ocropus?hl=en.

