So, if I understand correctly, you want to limit the number of reducers executing in parallel only for this job?
On Tue, Apr 30, 2013 at 4:02 PM, Han JU <[email protected]> wrote:

> Thanks.
>
> In fact I don't want to set reducer or mapper numbers, they are fine.
> I want to set the reduce slot capacity of my cluster when it executes my
> specific job. Say I have 100 reduce tasks for this job; I want my cluster
> to execute 4 of them at the same time, not 8 of them, only for this
> specific job.
> So I set mapred.tasktracker.reduce.tasks.maximum to 4 and submit the job.
> This conf is well received by the job, but ignored by Hadoop...
>
> Any idea why this is?
>
>
> 2013/4/30 Nitin Pawar <[email protected]>
>
>> The mapred.tasktracker.reduce.tasks.maximum parameter sets the maximum
>> number of reduce tasks that may be run by an individual TaskTracker
>> server at one time. It is not a per-job configuration.
>>
>> The number of map tasks for a given job is driven by the number of input
>> splits, not by the mapred.map.tasks parameter. For each input split a
>> map task is spawned, so over the lifetime of a MapReduce job the number
>> of map tasks equals the number of input splits. mapred.map.tasks is just
>> a hint to the InputFormat about the number of maps.
>>
>> If you want to set the maximum number of maps or reducers per job, you
>> can set the hints using the job object you created:
>> job.setNumMapTasks()
>>
>> Note this is just a hint, and again the number will be decided by the
>> input split size.
>>
>>
>> On Tue, Apr 30, 2013 at 3:39 PM, Han JU <[email protected]> wrote:
>>
>>> Thanks Nitin.
>>>
>>> What I need is to set the slots only for a specific job, not in the
>>> whole cluster conf.
>>> But what I did does NOT work... Have I done something wrong?
>>>
>>>
>>> 2013/4/30 Nitin Pawar <[email protected]>
>>>
>>>> The config you are setting is for the job only.
>>>>
>>>> But if you want to reduce the slots on the TaskTrackers, then you will
>>>> need to edit the TaskTracker conf and restart the TaskTrackers.
>>>> On Apr 30, 2013 3:30 PM, "Han JU" <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I want to change the cluster's capacity of reduce slots on a per-job
>>>>> basis. Originally I have 8 reduce slots per TaskTracker.
>>>>> I did:
>>>>>
>>>>> conf.set("mapred.tasktracker.reduce.tasks.maximum", "4");
>>>>> ...
>>>>> Job job = new Job(conf, ...)
>>>>>
>>>>> And in the web UI I can see that for this job the max reduce tasks is
>>>>> exactly 4, as I set. However, Hadoop still launches 8 reducers per
>>>>> datanode... Why is this?
>>>>>
>>>>> How could I achieve this?
>>>>> --
>>>>> JU Han
>>>>>
>>>>> Software Engineer Intern @ KXEN Inc.
>>>>> UTC - Université de Technologie de Compiègne
>>>>> GI06 - Fouille de Données et Décisionnel
>>>>>
>>>>> +33 0619608888
>>
>> --
>> Nitin Pawar
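The distinction running through this thread is that mapred.tasktracker.reduce.tasks.maximum is a daemon-side setting, read by each TaskTracker at startup; that is why setting it on a job's Configuration shows up in the web UI but is ignored at runtime. A minimal sketch of where the setting actually takes effect, assuming classic Hadoop 1.x MapReduce:

```xml
<!-- mapred-site.xml on each TaskTracker node. The TaskTracker daemon reads
     this at startup, so a restart is required after changing it. It caps
     concurrent reduce tasks for ALL jobs on that node; it cannot be scoped
     to a single job. -->
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
```

Per-job control of the total number of reducers (though not their concurrency) is available through job.setNumReduceTasks(n) on the job object; limiting how many reducers of one job run simultaneously would typically fall to a scheduler such as the Fair or Capacity Scheduler rather than to TaskTracker slot configuration.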
