Hi Faisal,
I took a look at "voronoi-ocropus.h" and it seems remove_noise already
has the default value "false".
However, I also tried changing the value as follows:
local segmenter2 = make_SegmentPageByVORONOI()
segmenter2:set("remove_noise",0)
but it didn't seem to make any difference to the segmented images. I am
currently using 0.2, but I updated all the Voronoi code from the svn version
and compiled it in 0.2.
Are there other places that need to be changed before I can disable
remove_noise?
Could you upload a segmented image for "01_1_1_2.PNG" produced by
Voronoi with simple_recolor() applied, so that I can see it?
Thanks for the help.
Cheers,
Leo
On Nov 19, 9:23 pm, "Faisal Shafait" <[EMAIL PROTECTED]> wrote:
> Hi Leo,
>
>
>
> > However, I just tried image segmentation using the VORONOI algorithm, and
> > it appears to me that the algorithm tends to erase some components
> > after segmentation.
>
> Voronoi may erase all components that it considers to be noise.
> Usually these are very small components, but you can disable this via a
> parameter (bool remove_noise).
>
> > Also, is the current VORONOI algorithm similar to the
> > XY-CUT algorithm, which is not yet able to provide information for each
> > block it finds?
>
> Yes, Voronoi also does not provide information about columns in the page.
>
> > I have uploaded "VORONOI.png" for your reference; the original image
> > is "01_1_1_2.PNG".
>
> There have been some bug fixes in both the Voronoi and x-y cut algorithms after
> the 0.2 release. Please use the 0.3 release to get the latest version. I tried
> this image and it looks fine to me (I am using the svn version, which is the same
> as the 0.3 release as far as Voronoi and x-y cut are concerned), though I needed
> to binarize the image before giving it to Voronoi.
>
> Cheers,
> Faisal
>
>
>
> > Cheers,
> > Leo
>
> > On Nov 19, 12:25 am, "Faisal Shafait" <[EMAIL PROTECTED]> wrote:
> > > Hi Leo,
> > > Since you are assigning everything to the first column, you will get
> > > everything as the first column :-)
>
> > > You need to write an algorithm that groups zones into text columns. You can
> > > find such an algorithm in ocr-layout-rast/ocr-layout-rast.cc (sorry, the file
> > > is a big mess at the moment, but I am working on refactoring it). Please
> > > look at the method:
> > > void SegmentPageByRAST::getCol(rectarray &columns, rectarray &paragraphs)
> > > that takes a rectarray of paragraphs and groups them into text columns using
> > > their alignment and position on the page. You can treat the zones returned by
> > > x-y cut as paragraphs and try this algorithm. If this does not work, then you
> > > will have to write an algorithm of your own.
>
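As a starting point for a home-grown approach, zones can be grouped by horizontal overlap. The sketch below is not the RAST getCol method; it is a much simpler stand-in, and the `Zone` struct and `assign_columns` function are hypothetical names, not OCRopus types. It assumes every zone lies within a single column, so a full-width heading would merge the columns it spans.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical box type for illustration; not the OCRopus rectangle class.
struct Zone { int x0, y0, x1, y1; };

// Returns a 1-based column number for each zone, in input order.
// Zones whose horizontal extents overlap (directly or through a chain of
// other zones) end up in the same column.
std::vector<int> assign_columns(const std::vector<Zone>& zones) {
    // Visit zones from left to right by their left edge.
    std::vector<int> order(zones.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return zones[a].x0 < zones[b].x0; });

    std::vector<int> column(zones.size(), 0);
    int current = 0;  // number of the column being built (0 = none yet)
    int right = 0;    // right edge of the column being built
    for (int idx : order) {
        if (current == 0 || zones[idx].x0 >= right) {
            // No horizontal overlap with the current column: start a new one.
            ++current;
            right = zones[idx].x1;
        } else {
            // Overlap: stay in the current column and widen it if needed.
            right = std::max(right, zones[idx].x1);
        }
        column[idx] = current;
    }
    return column;
}
```

The resulting column numbers could then be written into the segmentation image instead of assigning every block to column 1.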
> > > Cheers,
> > > Faisal
>
> > > 2008/11/18 Leo <[EMAIL PROTECTED]>
>
> > > > Hi Faisal
>
> > > > I have made the change that you suggested, and the output now passes the
> > > > check_page_segmentation() function.
> > > > However, when I try to use RegionExtractor for text column/
> > > > paragraph/line extraction, it still doesn't work well.
> > > > It returns the whole image as one column when the image actually
> > > > contains two columns. Is there any way to work around this?
> > > > Thanks for the help.
>
> > > > Cheers,
> > > > Leo
>
> > > > On Nov 18, 1:37 am, "Faisal Shafait" <[EMAIL PROTECTED]> wrote:
> > > > > Hi Leo,
> > > > > Thanks for reporting this bug.
>
> > > > > XYCUTS and Voronoi algorithms divide a page into several blocks. They do not
> > > > > define the role of these blocks (whether the block contains text/images etc.)
> > > > > and provide no information about the columnar structure of the document.
> > > > > Therefore they do not pass the check_page_segmentation() function, as the
> > > > > function checks for proper encoding of text columns/paragraphs/lines etc.
>
> > > > > A work-around for the moment would be to assign all blocks to the first
> > > > > column. I have changed that in the svn version. You just need to replace:
> > > > > int color = i+1;
> > > > > in the segment() method with:
> > > > > int color = (i+1) | (0x00010000);
>
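The effect of that one-line change can be sketched in isolation. This is a minimal illustration, not OCRopus code: `label_block`, `column_of`, and `block_of` are hypothetical helpers, and the layout (column number in bits 16-23, block index in the low 16 bits) is an assumption inferred from the 0x00010000 constant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not part of OCRopus.
// Assumed layout: column number in bits 16-23, block index in bits 0-15.

// Builds the label for the i-th block (0-based), tagged with a column number.
// With column = 1 this matches (i+1) | 0x00010000 from the workaround.
inline std::uint32_t label_block(int i, int column = 1) {
    assert(i >= 0 && i + 1 <= 0xFFFF);      // block id must stay below bit 16
    assert(column >= 1 && column <= 0xFF);  // column must fit in one byte
    return static_cast<std::uint32_t>(i + 1) |
           (static_cast<std::uint32_t>(column) << 16);
}

// Extracts the column number from a label.
inline int column_of(std::uint32_t color) {
    return static_cast<int>((color >> 16) & 0xFF);
}

// Extracts the 1-based block id from a label.
inline int block_of(std::uint32_t color) {
    return static_cast<int>(color & 0xFFFF);
}
```

Under these assumptions, `label_block(4)` yields 0x00010005: block 5 in column 1. A plain `i+1` leaves the column field at zero.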
> > > > > Cheers,
> > > > > Faisal
>
> > > > > On Mon, Nov 17, 2008 at 10:38 AM, Leo <[EMAIL PROTECTED]> wrote:
>
> > > > > > Hi All,
>
> > > > > > I am currently using version 0.2 and trying out the new XYCUTS page
> > > > > > segmentation algorithm, make_SegmentPageByXYCUTS(). However, I found
> > > > > > that after segmentation the intarray doesn't pass the
> > > > > > check_page_segmentation() function, so I cannot use it with
> > > > > > RegionExtractor. Has anyone had a similar problem?
>
> > > > > > Cheers,
> > > > > > Leo