RE: Not all Mapper/Reducer slots are taken when running K-Means cluster

WangRamon Sat, 10 Mar 2012 21:34:46 -0800
Here is the configuration:
    <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>14</value>
    </property>
    <property>
        <name>mapred.tasktracker.reduce.tasks.maximum</name>
        <value>14</value>
    </property>
    <property>
        <name>mapred.reduce.tasks</name>
        <value>73</value>
    </property>
 
Each node has 32 GB of RAM, so I think the above configuration should be fine.
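
For reference, the numbers in this thread fit together as sketched below. This is only illustrative arithmetic, not anything taken from Hadoop or Mahout: the node count of 3 is an assumption inferred from 42 total slots at 14 per node, and the helper names are made up for this example.

```python
import math


def total_slots(nodes: int, slots_per_node: int) -> int:
    """Cluster-wide slot count: every TaskTracker contributes slots_per_node."""
    return nodes * slots_per_node


def suggested_reduce_tasks(slots: int, factor: float = 1.75) -> int:
    """Reduce-task count as factor * slots, rounded down (42 * 1.75 = 73.5)."""
    return math.floor(slots * factor)


reduce_slots_per_node = 14  # mapred.tasktracker.reduce.tasks.maximum
nodes = 3                   # ASSUMPTION: 42 total slots / 14 slots per node

slots = total_slots(nodes, reduce_slots_per_node)
reduce_tasks = suggested_reduce_tasks(slots)  # matches mapred.reduce.tasks
idle_slots = slots - 12     # about 12 reducers were observed running at once

print(slots, reduce_tasks, idle_slots)
```

If the assumed node count is right, the configured values are consistent with each other, so the roughly 30 idle reduce slots are not explained by the slot or reduce-task settings alone.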
> Date: Sat, 10 Mar 2012 22:31:44 -0700
> From: [email protected]
> To: [email protected]
> Subject: Re: Not all Mapper/Reducer slots are taken when running K-Means 
> cluster
> 
> What's your Hadoop config in terms of the maximum number of reducers?
> It's a function of your available RAM on each node and numbers of nodes.
> 
> On 3/10/12 8:55 PM, WangRamon wrote:
> > Hi Paritosh,
> >
> > I ran the tests with 1 job and with 5 jobs, and both show the same
> > problem. The job I'm running is the buildClusters one. In the monitoring
> > GUI I can see that 73 reduce tasks are created, but only 12 of them are
> > running at any time (the rest are pending). The tasks finish very
> > quickly, with each reduce task taking no more than about 18 seconds, so
> > maybe that's the cause?
> >
> > Thanks
> > Cheers
> > Ramon
> >> Date: Sun, 11 Mar 2012 09:14:15 +0530
> >> From: [email protected]
> >> To: [email protected]
> >> Subject: Re: Not all Mapper/Reducer slots are taken when running K-Means 
> >> cluster
> >>
> >> And to answer the question about KMeans configuration :
> >>
> >> Kmeans has two jobs :
> >> 1) buildClusters : has a reducer and has no limitation on the number of
> >> reducer tasks
> >> 2) clusterData : executes if runClustering = true, has no reducer tasks
> >>
> >> On 11-03-2012 09:10, Paritosh Ranjan wrote:
> >>> Can you run the K-Means jobs again (all with the same block size) and
> >>> give the same statistics for:
> >>>
> >>> a) only 1 job running
> >>> b) 2 jobs running simultaneously
> >>> c) 5 jobs running simultaneously
> >>>
> >>> On 10-03-2012 21:08, WangRamon wrote:
> >>>>
> >>>> Hi All,
> >>>>
> >>>> I submitted 5 K-Means jobs simultaneously. My Hadoop cluster has 42 map
> >>>> and 42 reduce slots configured, and I set the default number of reduce
> >>>> tasks per job to 73 (42 * 1.75). I find that only about 12 reduce tasks
> >>>> are running at any time, although 73 reduce tasks are created for each
> >>>> K-Means job and I do have 42 reduce slots, which means about 30 reduce
> >>>> slots are free at any time.
> >>>>
> >>>> So I tried Mahout's RecommenderJob again, since I remembered that job
> >>>> using all my slots in a previous test. And yes, this time
> >>>> "RowSimilarityJob-CooccurrencesMapper-Reducer" does use all the slots,
> >>>> 42 reduce and 42 map. So I'm wondering whether something configured in
> >>>> Mahout causes this strange behavior. Any suggestions? Thanks in
> >>>> advance.
> >>>>
> >>>> Btw, I'm using the mahout-0.6 release.
> >>>>
> >>>> Cheers
> >>>> Ramon
> >>>>
> >
>