Well, I have reread Ted's answer after looking at some of the information Isabel gave me, and I think you are right. But I am still not sure why Mahout's k-means algorithm cannot be used with strings once a string distance metric has been defined. Taking Jeff's advice, I could use a Map between doubles and strings: storing doubles throughout the algorithm, and retrieving the strings to compute distances in the measuring steps. Would that make any sense?
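To make the idea concrete, here is a rough sketch of the mapping I have in mind. It is plain Java and does not touch any Mahout class: `StringIdRegistry` and its methods are names I made up for illustration, and Levenshtein edit distance is only an assumed choice of string metric.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Mahout): lets doubles stand in for strings. */
final class StringIdRegistry {
    // The position in this list is the id that represents the string.
    private final List<String> strings = new ArrayList<>();

    /** Stores a string and returns the id, as a double, used in its place. */
    double register(String s) {
        strings.add(s);
        return strings.size() - 1;
    }

    /** Retrieves the string behind an id; any fractional part is truncated. */
    String lookup(double id) {
        return strings.get((int) id);
    }

    /** Measuring step: the distance between two ids is the distance between their strings. */
    double distance(double idA, double idB) {
        return levenshtein(lookup(idA), lookup(idB));
    }

    /** Levenshtein edit distance, computed with two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

With "kitten" registered as id 0 and "sitting" as id 1, `distance(0, 1)` gives 3.0. One thing the sketch leaves open is the centroid step: k-means averages its points, and the average of those two ids (0.5) does not stand for any string, so `lookup` simply truncates it back to "kitten".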
Regards,
jfcg

> Subject: Re: String clustering and other newbie questions
> From: [email protected]
> Date: Tue, 1 Sep 2009 05:33:34 -0700
> To: [email protected]
>
>
> On Sep 1, 2009, at 5:06 AM, Juan Francisco Contreras Gaitan wrote:
>
> >
> > Ok, I see. Sorry for my unknowledge on these matters (I am going to
> > read all the documentation you gave me closely).
> >
> > But if I understood you well, and as far as I know, Mahout has its
> > own k-means implementation. Then, could I use it for my purposes
> > instead of DP like setup?
>
> I think Ted was saying that DP is the only one that would work for
> what you described, but it's also possible we aren't understanding the
> problem right either.
>
> Obviously, one of the things we as a project need to develop more is
> guidelines on which approaches work for which types of problems..
>
> -Grant
