Hello,
I am using Mahout for text clustering. My hardware and software are listed below.

Server: CPU: Intel Xeon E5-2620 2 GHz, RAM: 64 GB
Software: Ubuntu 12.04.1 on VirtualBox, hadoop-1.0.4, mahout-0.7

I am using the canopy algorithm to cluster 80,000 txt files, but it runs for a very long time: it needs two or three weeks to finish. However, I have found that CPU utilization stays below 20%.

I have found that someone else also has this problem:
http://mail-archives.apache.org/mod_mbox/mahout-user/201212.mbox/%3C79595651 86420075099@unknownmsgid%3E#archives

but I still don't know how to speed it up. Is there some parameter setting I have missed? Or is the server not powerful enough to run this job? Can someone point me in the right direction?

Thanks a lot.
Fisher
