To answer the question about KMeans configuration: KMeans runs two jobs:

1) buildClusters: has a reducer, with no limit on the number of reducer tasks.
2) clusterData: runs only if runClustering = true, and has no reducer tasks.
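
As a quick check on the numbers in the quoted message, here is a minimal Python sketch of the slot arithmetic. The function names are hypothetical and exist only for illustration; the 42 slots, the 1.75 factor, and the roughly 12 running reducers are the figures reported in the thread. It assumes the product 73.5 was rounded down to 73.

```python
import math


def reduce_tasks_per_job(reduce_slots: int, factor: float = 1.75) -> int:
    """Reduce tasks configured per job: slots times factor, rounded down (assumed)."""
    return math.floor(reduce_slots * factor)


def free_slots(reduce_slots: int, running_reducers: int) -> int:
    """Reduce slots left idle while only `running_reducers` tasks are running."""
    return reduce_slots - running_reducers


REDUCE_SLOTS = 42       # reduce slots configured on the cluster (from the thread)
RUNNING_REDUCERS = 12   # approximate number of reducers observed running

tasks = reduce_tasks_per_job(REDUCE_SLOTS)
idle = free_slots(REDUCE_SLOTS, RUNNING_REDUCERS)
print(f"reduce tasks per job: {tasks}, idle reduce slots: {idle}")
```

This only makes the gap explicit (73 tasks requested per job, about 30 of 42 slots idle); it does not explain why the scheduler leaves those slots unused.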
On 11-03-2012 09:10, Paritosh Ranjan wrote:
> Can you run the K-Means jobs again (all with the same block size) and give
> the same statistics for:
>
> a) only 1 job running
> b) 2 jobs running simultaneously
> c) 5 jobs running simultaneously
>
> On 10-03-2012 21:08, WangRamon wrote:
>> Hi All,
>>
>> I submitted 5 K-Means jobs simultaneously. My Hadoop cluster has 42 map
>> and 42 reduce slots configured, and I set the default number of reduce
>> tasks per job to 73 (42 * 1.75). I find that only about 12 reduce tasks
>> are running at any time, although 73 reduce tasks are created for each
>> K-Means job and I do have 42 reduce slots. That means about 30 reduce
>> slots are free at any time.
>>
>> So I tried RecommenderJob from Mahout again. I remembered that job using
>> all my slots in my previous test, and yes, this time
>> "RowSimilarityJob-CooccurrencesMapper-Reducer" does use all the slots,
>> 42 reduce and 42 map. So I'm wondering: is there something configured in
>> Mahout that causes this strange behavior? Any suggestions?
>>
>> Thanks in advance. Btw, I'm using the mahout-0.6 release.
>>
>> Cheers
>> Ramon
