kmeans local vs mapreduce difference

2013-04-25 Thread Mihai Josan
Hi, I'm running a kmeans clustering job on a small sequence file (around 50 KB) on a 2-node cluster. The block size for this file is 20 KB, so it uses 3 mappers. I am using CDH4.2.0 with YARN and Mahout 0.7. If the job runs locally on only one node, CPU usage is around 20% and the job finishes in
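
The mapper count quoted above follows from simple arithmetic. A minimal sketch, assuming one mapper per input split and a split size equal to the block size (the real Hadoop split logic has extra rules that this ignores); the class and method names are hypothetical:

```java
public class SplitEstimate {

    // Simplified model (assumption): number of splits = ceil(fileBytes / blockBytes),
    // and each split is handled by exactly one mapper.
    static long estimateSplits(long fileBytes, long blockBytes) {
        if (fileBytes < 0 || blockBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        // Integer ceiling division without floating point.
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long fileBytes = 50L * 1024;   // "around 50 KB" from the message
        long blockBytes = 20L * 1024;  // 20 KB block size from the message
        System.out.println(estimateSplits(fileBytes, blockBytes));
    }
}
```

Under this model a 50 KB file with 20 KB blocks gives two full splits plus one 10 KB remainder, which matches the 3 mappers reported in the message.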

RE: RE: how to use a custom distance measure with kmeans?

2013-02-19 Thread Mihai Josan
Hello, I managed to resolve the problem without modifying the Mahout script. I inserted my classes into the mahout job jar (mahout-examples-0.7-cdh4.1.2-job.jar) and everything is ok now. Thank you very much for your help, Mihai Josan -Original Message- From: Mihai Josan
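
The stack traces in this thread go through Class.forName, which suggests the distance measure is looked up by class name at run time. A minimal, self-contained sketch of that pattern, assuming nothing about Mahout's actual code: the Measure interface, the Manhattan class, and the instantiateAs helper below are hypothetical stand-ins, not Mahout's API.

```java
public class LoadByName {

    // Hypothetical stand-in for a distance-measure interface.
    public interface Measure {
        double distance(double[] a, double[] b);
    }

    // Hypothetical custom measure: sum of absolute coordinate differences.
    public static class Manhattan implements Measure {
        @Override
        public double distance(double[] a, double[] b) {
            double sum = 0.0;
            for (int i = 0; i < a.length; i++) {
                sum += Math.abs(a[i] - b[i]);
            }
            return sum;
        }
    }

    // Loads a class by name and instantiates it via its no-argument constructor.
    // Throws ClassNotFoundException if the class is not visible to the class loader.
    static <T> T instantiateAs(String className, Class<T> type)
            throws ReflectiveOperationException {
        return Class.forName(className)
                .asSubclass(type)
                .getDeclaredConstructor()
                .newInstance();
    }

    public static void main(String[] args) throws Exception {
        // The class is on the classpath, so the lookup succeeds.
        Measure m = instantiateAs(Manhattan.class.getName(), Measure.class);
        System.out.println(m.distance(new double[] {0, 0}, new double[] {3, 4}));

        // A class that is not on the classpath fails at lookup time.
        try {
            instantiateAs("com.example.MissingMeasure", Measure.class);
        } catch (ClassNotFoundException e) {
            System.out.println("not found: " + e.getMessage());
        }
    }
}
```

A lookup like this only succeeds when the named class is visible to the class loader doing the lookup, which is consistent with the fix described above: once the custom classes were inside the job jar, they could be found.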

RE: how to use a custom distance measure with kmeans?

2013-02-14 Thread Mihai Josan
specified (but this looks unlikely). I'm not sure why the mahout script has those cases starting at line 239 (especially since they are undocumented). Let us know if it works! On Wed, Feb 13, 2013 at 6:18 PM, Mihai Josan mihai.jo...@iquestgroup.com wrote: Hello, After I made the changes, I still

RE: how to use a custom distance measure with kmeans?

2013-02-13 Thread Mihai Josan
) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:169) at org.apache.mahout.common.ClassUtils.instantiateAs(ClassUtils.java:28) ... 15 more Thank you, Mihai Josan

Re: how to use a custom distance measure with kmeans?

2013-02-12 Thread Mihai Josan
(Class.java:169) at org.apache.mahout.common.ClassUtils.instantiateAs(ClassUtils.java:28) ... 15 more Is this the proper way to use the custom distance measure, or should I package the class? If so, how? Thank you in advance, Mihai Josan Are you getting any errors? Can you specify