Here is the configuration:

<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>14</value>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>14</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>73</value>
</property>
Each node has 32 GB of RAM, so I think the above configuration should be
fine.
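
As a rough sanity check on these numbers, here is a small Python sketch of the slot arithmetic. The node count of 3 is an assumption, inferred from 42 total slots divided by 14 slots per node, and the helper names are made up for illustration. The RAM figure is only a naive per-slot budget that ignores the OS and the Hadoop daemons.

```python
# Rough slot arithmetic for the settings above (illustrative helpers only).
# ASSUMPTION: 3 TaskTracker nodes, inferred from 42 total slots / 14 per node.
NODES = 3
MAP_SLOTS_PER_NODE = 14      # mapred.tasktracker.map.tasks.maximum
REDUCE_SLOTS_PER_NODE = 14   # mapred.tasktracker.reduce.tasks.maximum
RAM_PER_NODE_GB = 32


def cluster_slots(nodes, slots_per_node):
    """Total slots of one kind (map or reduce) across the cluster."""
    return nodes * slots_per_node


def suggested_reduce_tasks(total_reduce_slots, factor=1.75):
    """Reduce tasks per job as slots * factor, truncated to an integer."""
    return int(total_reduce_slots * factor)


def ram_per_slot_gb(ram_gb, map_slots, reduce_slots):
    """Naive RAM budget per slot if every slot on a node is busy."""
    return ram_gb / (map_slots + reduce_slots)


total_reduce_slots = cluster_slots(NODES, REDUCE_SLOTS_PER_NODE)
print(total_reduce_slots)                          # 42
print(suggested_reduce_tasks(total_reduce_slots))  # 73
print(round(ram_per_slot_gb(RAM_PER_NODE_GB,
                            MAP_SLOTS_PER_NODE,
                            REDUCE_SLOTS_PER_NODE), 2))  # 1.14
```

So with all 28 slots on a node busy, each task would have a bit over 1 GB available, before subtracting whatever the OS and the daemons need.
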
> Date: Sat, 10 Mar 2012 22:31:44 -0700
> From: [email protected]
> To: [email protected]
> Subject: Re: Not all Mapper/Reducer slots are taken when running K-Means
> cluster
>
> What's your Hadoop config in terms of the maximum number of reducers?
> It's a function of your available RAM on each node and numbers of nodes.
>
> On 3/10/12 8:55 PM, WangRamon wrote:
> > Hi Paritosh,
> >
> > I ran the tests with 1 job and with 5 jobs, and both show the same
> > problem. The job I'm running is the buildClusters one. In the monitor GUI
> > I can see that 73 reduce tasks are created, but only 12 of them are
> > running at any time (the rest are pending). Each reduce task finishes
> > very quickly, in no more than about 18 seconds, so maybe that's the cause?
> >
> > Thanks.
> > Cheers,
> > Ramon
> >> Date: Sun, 11 Mar 2012 09:14:15 +0530
> >> From: [email protected]
> >> To: [email protected]
> >> Subject: Re: Not all Mapper/Reducer slots are taken when running K-Means
> >> cluster
> >>
> >> And to answer the question about KMeans configuration :
> >>
> >> KMeans has two jobs:
> >> 1) buildClusters : has a reducer and has no limitation on the number of
> >> reduce tasks
> >> 2) clusterData : executes if runClustering = true, has no reduce tasks
> >>
> >> On 11-03-2012 09:10, Paritosh Ranjan wrote:
> >>> Can you run the K-Means jobs again (all with the same block size) and
> >>> give the same statistics for:
> >>>
> >>> a) only 1 job running
> >>> b) 2 jobs running simultaneously
> >>> c) 5 jobs running simultaneously
> >>>
> >>> On 10-03-2012 21:08, WangRamon wrote:
> >>>>
> >>>> Hi All,
> >>>>
> >>>> I submitted 5 K-Means jobs simultaneously. My Hadoop cluster has 42 map
> >>>> and 42 reduce slots configured, and I set the default number of reduce
> >>>> tasks per job to 73 (42 * 1.75). I find that only about 12 reduce tasks
> >>>> are running at any time, although 73 reduce tasks are created for each
> >>>> K-Means job and I do have 42 reduce slots. That means about 30 reduce
> >>>> slots are free at any time.
> >>>>
> >>>> So I tried RecommenderJob from Mahout again, because I remembered that
> >>>> it used all my slots in a previous test. And yes, this time
> >>>> "RowSimilarityJob-CooccurrencesMapper-Reducer" does use all the slots,
> >>>> 42 reduce and 42 map. So I'm wondering: is something configured in
> >>>> Mahout that causes this strange behavior? Any suggestions?
> >>>>
> >>>> Thanks in advance. Btw, I'm using the mahout-0.6 release.
> >>>>
> >>>> Cheers,
> >>>> Ramon
> >>>>
> >
>