Hi Faisal

I will try writing an algorithm that groups zones into text
columns, based on the getCol function.
However, I just tried image segmentation with the Voronoi algorithm, and
it appears to me that the algorithm tends to erase some components
after segmentation. Also, is the current Voronoi algorithm similar to the
XY-cut algorithm, in that it cannot yet provide information about each
block it finds?

I have uploaded "VORONOI.png" for your reference; the original image
is "01_1_1_2.PNG".

Cheers,
Leo
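
For the column grouping itself, here is a minimal sketch of one simple
heuristic: sort the zones by their left edge and put zones whose horizontal
extents overlap into the same column. This is not the getCol method from
ocr-layout-rast.cc, only a stand-in to experiment with. The Zone struct and
the assign_columns name are hypothetical, not part of OCRopus.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical bounding box for one zone: x0 <= x < x1, y0 <= y < y1.
struct Zone { int x0, y0, x1, y1; };

// Returns a 1-based column number for each zone, in input order.
// Zones whose horizontal extents overlap end up in the same column.
std::vector<int> assign_columns(const std::vector<Zone>& zones) {
    // Visit zones from left to right without reordering the input.
    std::vector<std::size_t> order(zones.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return zones[a].x0 < zones[b].x0;
    });

    std::vector<int> column(zones.size(), 0);
    int current = 0;     // column number assigned so far
    int right_edge = 0;  // right-most x1 seen in the current column
    for (std::size_t k = 0; k < order.size(); ++k) {
        const Zone& z = zones[order[k]];
        if (k == 0 || z.x0 >= right_edge) {
            // No horizontal overlap with the current column: start a new one.
            ++current;
            right_edge = z.x1;
        } else {
            right_edge = std::max(right_edge, z.x1);
        }
        column[order[k]] = current;
    }
    return column;
}
```

Note that a full-width zone, such as a page heading, would merge every column
it spans, so such zones would need to be filtered out before grouping.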

On Nov 19, 12:25 am, "Faisal Shafait" <[EMAIL PROTECTED]> wrote:
> Hi Leo,
> Since you are assigning everything to the first column, you will get
> everything as the first column :-)
>
> You need to write an algorithm that groups zones into text columns. You can
> find such an algorithm in ocr-layout-rast/ocr-layout-rast.cc (sorry the file
> is a big mess at the moment, but I am working on refactoring it). Please
> look at the method:
>    void SegmentPageByRAST::getCol(rectarray &columns, rectarray &paragraphs)
> that takes a rectarray of paragraphs and groups them into text columns using
> their alignment and position on the page. You can treat the zones returned by
> x-y cut as paragraphs and try this algorithm. If this does not work then you
> have to write some algorithm on your own.
>
> Cheers,
> Faisal
>
> 2008/11/18 Leo <[EMAIL PROTECTED]>
>
>
>
> > Hi Faisal
>
> > I have made the change that you suggested, and it now passes the
> > check_page_segmentation() function.
> > However, when I try to use RegionExtractor for text column/
> > paragraph/line extraction, it still doesn't work quite well.
> > It returns the whole image as one column when the image actually
> > contains two columns. Is there any way to work around it?
> > Thanks for the help.
>
> > Cheers,
> > Leo
>
> > On Nov 18, 1:37 am, "Faisal Shafait" <[EMAIL PROTECTED]> wrote:
> > > Hi Leo,
> > > Thanks for reporting this bug.
>
> > > XYCUTS and Voronoi algorithms divide a page into several blocks. They do not
> > > define the role of these blocks whether the block contains text/images etc.
> > > and provide no information about the columnar structure of the document.
> > > Therefore they do not pass the check_page_segmentation() function as the
> > > function checks for proper encoding of text columns/ paragraphs/ lines etc.
>
> > > A work-around for the moment would be to assign all blocks to the first
> > > column. I have changed that in the svn version. You just need to replace:
> > >             int color = i+1;
> > > in the segment() method with:
> > >             int color = (i+1) | (0x00010000);
>
> > > Cheers,
> > > Faisal
>
> > > On Mon, Nov 17, 2008 at 10:38 AM, Leo <[EMAIL PROTECTED]> wrote:
>
> > > > Hi All,
>
> > > > I am currently using version 0.2 and trying out the new XYCUTS page
> > > > segmentation algorithm, make_SegmentPageByXYCUTS(). However, I found
> > > > that after segmentation the intarray doesn't pass the
> > > > check_page_segmentation() function, so I cannot use it with
> > > > RegionExtractor. Has anyone had a similar problem?
>
> > > > Cheers,
> > > > Leo
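
The work-around quoted above ORs 0x00010000 into each block label to mark the
block as belonging to column 1. Below is a minimal sketch of that packing. It
assumes the column number sits in bits 16-23 and the block label stays in the
low 16 bits; the thread only confirms the 0x00010000 constant for the first
column. The helper names with_column, column_of and block_of are hypothetical
and are not OCRopus functions.

```cpp
#include <cassert>

// Assumption: bits 16-23 carry the column number and the low 16 bits keep
// the block label. Only the value 0x00010000 (column 1) comes from the thread.
const int COLUMN_SHIFT = 16;
const int COLUMN_MASK = 0xFF;
const int BLOCK_MASK = 0xFFFF;

// Combine a block label (for example i+1) with a column number.
int with_column(int block_label, int column) {
    assert(block_label >= 0 && block_label <= BLOCK_MASK);
    assert(column >= 0 && column <= COLUMN_MASK);
    return block_label | (column << COLUMN_SHIFT);
}

// Recover the column number from a packed value.
int column_of(int color) { return (color >> COLUMN_SHIFT) & COLUMN_MASK; }

// Recover the block label from a packed value.
int block_of(int color) { return color & BLOCK_MASK; }
```

With these helpers, with_column(i + 1, 1) produces the same value as
(i+1) | (0x00010000), and passing a different column number would be the place
to plug in the result of a column-grouping step.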