Hello,

 

I am using Mahout to do text clustering.

 

My hardware and software are listed below.

 

Server:

CPU: Intel Xeon E5-2620 2 GHz, RAM: 64 GB

 

Software:

ubuntu-12.4.1 on VirtualBox, hadoop-1.0.4, mahout-0.7

 

I am using the canopy algorithm to cluster 80,000 text files,

but the job runs for a very long time: it needs two or three weeks to finish.

 

Meanwhile, I have found that CPU utilization stays below 20%.

 

I found that someone else has had the same problem:

http://mail-archives.apache.org/mod_mbox/mahout-user/201212.mbox/%3C7959565186420075099@unknownmsgid%3E#archives

 

However, I still don't know how to speed it up.

Is there some parameter setting that I have missed?

Or is the server not powerful enough to run this job?

 

Can someone point me in the right direction? Thanks a lot.

 

Fisher

 

 

 
