To answer the question about the KMeans configuration:

KMeans has two jobs:
1) buildClusters: has a reduce phase, with no limit on the number of
reduce tasks
2) clusterData: runs only if runClustering = true, and has no reduce tasks

On 11-03-2012 09:10, Paritosh Ranjan wrote:
> Can you run the K-Means jobs again (all with the same block size) and give
> the same statistics for:
>
> a) only 1 job running
> b) 2 jobs running simultaneously
> c) 5 jobs running simultaneously
>
> On 10-03-2012 21:08, WangRamon wrote:
>>
>>
>> Hi All,
>>
>> I submitted 5 K-Means jobs simultaneously. My Hadoop cluster has 42 map and
>> 42 reduce slots configured, and I set the default number of reduce tasks per
>> job to 73 (42 * 1.75). Although 73 reduce tasks are created for each K-Means
>> job and I have 42 reduce slots, only about 12 reduce tasks are running at any
>> time, which means about 30 reduce slots are always free.
>>
>> So I tried Mahout's RecommenderJob again, since I remembered that it used all
>> my slots in a previous test. This time, too,
>> "RowSimilarityJob-CooccurrencesMapper-Reducer" uses all the slots (42 reduce
>> and 42 map). Is something configured in Mahout that causes this strange
>> behavior? Any suggestions? Thanks in advance.
>>
>> Btw, I'm using the mahout-0.6 release.
>>
>> Cheers,
>> Ramon
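
For reference, the slot arithmetic in the original question can be sketched roughly as below. This is only an illustration in Python, not anything Mahout or Hadoop runs; the function names are hypothetical, and rounding 42 * 1.75 down to 73 is an assumption that happens to match the number quoted above.

```python
import math

# Figures taken from the original question.
REDUCE_SLOTS = 42        # reduce slots configured on the cluster
FACTOR = 1.75            # multiplier used to pick the default reduce task count
JOBS = 5                 # K-Means jobs submitted simultaneously
OBSERVED_RUNNING = 12    # reduce tasks seen running at any one time (approximate)


def default_reduce_tasks(slots, factor):
    """Reduce tasks per job, rounded down (assumed rounding; hypothetical helper)."""
    return math.floor(slots * factor)


def free_slots(slots, running):
    """Reduce slots left idle given the number of running reduce tasks."""
    return max(slots - running, 0)


tasks_per_job = default_reduce_tasks(REDUCE_SLOTS, FACTOR)
total_tasks = tasks_per_job * JOBS
idle = free_slots(REDUCE_SLOTS, OBSERVED_RUNNING)

print("reduce tasks per job:", tasks_per_job)
print("reduce tasks across all jobs:", total_tasks)
print("idle reduce slots:", idle)
```

With many more reduce tasks requested than there are slots, the idle slots are not explained by a shortage of tasks, which is why the per-job statistics asked for above are useful.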
