Thanks a million for the information.

Cheers,
jfcg
> Date: Fri, 28 Aug 2009 09:37:57 -0700
> From: [email protected]
> To: [email protected]
> Subject: Re: String clustering and other newbie questions
>
> Juan Francisco Contreras Gaitan wrote:
> > Hello,
> >
> > I would like to do some clustering by using Hadoop and I found Mahout. I am
> > really impressed, but as a newbie I got stuck and I have several questions.
> > The idea is to do string clustering: I have properties values expressed as
> > strings of some resources, and I would like to aggregate these resources. I
> > use Eclipse as IDE, and I have two Mahout working projects, one with
> > release version (0.1) and the other one with SVN version. I am able to
> > compile examples and to run them on my own Hadoop cluster. I have focused
> > on Synthetic Control Data example using Canopy algorithm because of its
> > similarity to my problem.
> >
> > - on release version with default parameter values I get all the items on
> > the same cluster (C1), is it normal?
> There was an issue with hadoop 0.19 & above running combiners both on
> the map side and the reduce side which causes this behavior in the
> released code. Your best bet would be to use the trunk version.
>
> adil
