Hi Leo,
>
> however, I just tried image segmentation using the VORONOI algorithm, and
> it appears to me that the algorithm tends to erase some components
> after segmentation,


Voronoi may erase any components that it considers to be noise. These are
usually very small components, but you can disable this with a parameter
(bool remove_noise).
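
To illustrate what such a switch does, here is a minimal sketch of size-based
noise filtering. It is not the actual Voronoi code: the Component struct, the
filter_noise() function, and the min_pixels threshold are hypothetical names
made up for this example, and the real criterion for "noise" may differ. Only
the remove_noise flag corresponds to the parameter mentioned above.

```cpp
#include <vector>

// Hypothetical connected-component record; not an OCRopus type.
struct Component {
    int x0, y0, x1, y1;  // bounding box corners
    int pixel_count;     // number of foreground pixels in the component
};

// Returns the components that survive noise removal.
// Assumption: "noise" means fewer than min_pixels foreground pixels.
// With remove_noise == false, every component is kept unchanged.
std::vector<Component> filter_noise(const std::vector<Component>& comps,
                                    bool remove_noise,
                                    int min_pixels) {
    if (!remove_noise) return comps;
    std::vector<Component> kept;
    for (const Component& c : comps) {
        if (c.pixel_count >= min_pixels) kept.push_back(c);
    }
    return kept;
}
```

Calling filter_noise(comps, false, 10) returns all components, which is the
behaviour you want if small marks such as dots or punctuation are disappearing.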


> also, is the current VORONOI algorithm similar to the
> XY-CUT algorithm, which is not yet able to provide information for each
> block it finds?


Yes, Voronoi also does not provide information about columns in the page.
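
For reference, the workaround quoted further down in this thread ORs each
block label with 0x00010000 to mark it as column 1. Here is a small sketch of
that packing, assuming the column number occupies bits 16-23 of the pixel
value (which is what the constant suggests). The helper names with_column(),
column_of(), and block_of() are hypothetical and not part of OCRopus.

```cpp
#include <cstdint>

// Assumption: the column number lives in bits 16-23 of the label,
// as suggested by the 0x00010000 constant used for "first column".
constexpr std::uint32_t kColumnShift = 16;

// Pack a block label (low 16 bits) together with a column number.
std::uint32_t with_column(std::uint32_t block_label, std::uint32_t column) {
    return (block_label & 0xFFFFu) | ((column & 0xFFu) << kColumnShift);
}

// Extract the column number from a packed label.
std::uint32_t column_of(std::uint32_t color) {
    return (color >> kColumnShift) & 0xFFu;
}

// Extract the block label from a packed label.
std::uint32_t block_of(std::uint32_t color) {
    return color & 0xFFFFu;
}
```

For block labels below 65536, with_column(i + 1, 1) gives the same value as
(i+1) | (0x00010000). Assigning real column numbers would mean passing
something other than 1, once you have grouped the blocks into columns.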


> I have uploaded "VORONOI.png" for your reference; the original image
> is "01_1_1_2.PNG"
>

There have been some bug fixes in both the Voronoi and x-y cut algorithms
since the 0.2 release. Please use the 0.3 release to get the latest version.
I tried this image and it looks fine to me (I am using the svn version, which
is the same as the 0.3 release as far as Voronoi and x-y cut are concerned),
though I needed to binarize the image before giving it to Voronoi.
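
In case it is unclear what binarizing involves, here is a minimal sketch using
a fixed global threshold on an 8-bit grayscale buffer. This is only an
illustration: the binarize() function and the threshold value are assumptions
for this example, and it is not necessarily the binarizer I used on your
image. An adaptive method would usually do better on unevenly lit scans.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Map every grayscale pixel to 0 or 255 using one global threshold.
// Pixels at or above the threshold become 255, all others become 0.
std::vector<std::uint8_t> binarize(const std::vector<std::uint8_t>& gray,
                                   std::uint8_t threshold) {
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        out[i] = (gray[i] >= threshold) ? 255 : 0;
    }
    return out;
}
```

The output contains only the two values 0 and 255, which is the kind of input
a segmenter that expects a binary image needs.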

Cheers,
Faisal


>
> Cheers,
> Leo
>
> On Nov 19, 12:25 AM, "Faisal Shafait" <[EMAIL PROTECTED]> wrote:
> > Hi Leo,
> > Since you are assigning everything to the first column, you will get
> > everything as the first column :-)
> >
> > You need to write an algorithm that groups zones into text columns. You can
> > find such an algorithm in ocr-layout-rast/ocr-layout-rast.cc (sorry, the file
> > is a big mess at the moment, but I am working on refactoring it). Please
> > look at the method:
> >    void SegmentPageByRAST::getCol(rectarray &columns, rectarray &paragraphs)
> > that takes a rectarray of paragraphs and groups them into text columns using
> > their alignment and position on the page. You can treat the zones returned by
> > x-y cut as paragraphs and try this algorithm. If this does not work, then you
> > will have to write an algorithm of your own.
> >
> > Cheers,
> > Faisal
> >
> > 2008/11/18 Leo <[EMAIL PROTECTED]>
> >
> >
> >
> > > Hi Faisal
> >
> > > I have made the change that you suggested, and it now passes the
> > > check_page_segmentation() function.
> > > However, when I try to use RegionExtractor for text column/
> > > paragraph/line extraction, it still doesn't work quite well:
> > > it returns the whole image as one column when the image actually
> > > contains two columns. Is there any way to work around it?
> > > Thanks for the help.
> >
> > > Cheers,
> > > Leo
> >
> > > On Nov 18, 1:37 AM, "Faisal Shafait" <[EMAIL PROTECTED]> wrote:
> > > > Hi Leo,
> > > > Thanks for reporting this bug.
> >
> > > > XYCUTS and Voronoi algorithms divide a page into several blocks. They do not
> > > > define the role of these blocks (whether the block contains text, images, etc.)
> > > > and provide no information about the columnar structure of the document.
> > > > Therefore they do not pass the check_page_segmentation() function, as the
> > > > function checks for proper encoding of text columns/paragraphs/lines, etc.
> >
> > > > A work-around for the moment would be to assign all blocks to the first
> > > > column. I have changed that in the svn version. You just need to replace:
> > > >             int color = i+1;
> > > > in the segment() method with:
> > > >             int color = (i+1) | (0x00010000);
> >
> > > > Cheers,
> > > > Faisal
> >
> > > > On Mon, Nov 17, 2008 at 10:38 AM, Leo <[EMAIL PROTECTED]> wrote:
> >
> > > > > Hi All,
> >
> > > > > I am currently using version 0.2 and trying out the new XYCUTS page
> > > > > segmentation algorithm, make_SegmentPageByXYCUTS(). However, I found
> > > > > that after segmentation the intarray doesn't pass the
> > > > > check_page_segmentation() function, so I cannot use it with
> > > > > RegionExtractor. Has anyone had a similar problem?
> >
> > > > > Cheers,
> > > > > Leo
> >
>
