Hi Pierre,

The "setNumReduceTasks" method sets the number of reduce tasks to launch for a job; it is equivalent to setting the "mapred.reduce.tasks" parameter. The "mapred.tasktracker.reduce.tasks.maximum" parameter, by contrast, decides how many reduce tasks run *concurrently* on one node. And as Amareshwari mentioned, "mapred.tasktracker.map.tasks.maximum" and "mapred.tasktracker.reduce.tasks.maximum" are cluster-level settings that cannot be set per job. If you set mapred.tasktracker.map.tasks.maximum to 20, and the overall number of map tasks is larger than 20 * <number of nodes>, there would be 20 map tasks running concurrently on each node. As far as I know, you will probably need to restart the TaskTrackers if you truly need to change this setting.
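To make the slot arithmetic concrete, here is a minimal sketch in plain Java. It is not Hadoop API code: the class MapSlotMath, its method names, and the example cluster of 5 nodes with 250 map tasks are all made up for illustration. It also assumes every slot stays busy and ignores scheduling and data locality, so treat the numbers as rough upper bounds.

```java
// Back-of-the-envelope model of TaskTracker map slots; hypothetical names, not Hadoop API code.
final class MapSlotMath {

    // Map tasks the whole cluster can run at once: slots per node times node count.
    static long clusterCapacity(int slotsPerNode, int nodes) {
        return (long) slotsPerNode * nodes;
    }

    // Upper bound on map tasks running at the same time on a single node.
    static long maxConcurrentOnNode(int slotsPerNode, long totalMapTasks) {
        return Math.min(slotsPerNode, totalMapTasks);
    }

    // Number of "waves" needed if every slot stays busy (ceiling division).
    static long waves(long totalMapTasks, int slotsPerNode, int nodes) {
        long capacity = clusterCapacity(slotsPerNode, nodes);
        return (totalMapTasks + capacity - 1) / capacity;
    }

    public static void main(String[] args) {
        int nodes = 5;            // hypothetical cluster size
        long totalMapTasks = 250; // hypothetical; in practice this comes from the input splits
        for (int slots : new int[] {12, 20}) {
            System.out.printf("slots=%d capacity=%d waves=%d%n",
                    slots, clusterCapacity(slots, nodes), waves(totalMapTasks, slots, nodes));
        }
    }
}
```

With these made-up numbers, raising the per-node maximum from 12 to 20 changes how many maps the cluster runs at once, not how many map tasks the job has.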
Best Regards,
Carp

2010/6/30 Pierre ANCELOT <[email protected]>

> Sure, but not the number of tasks running concurrently on a node at the
> same time.
>
> On Wed, Jun 30, 2010 at 1:57 PM, Ted Yu <[email protected]> wrote:
>
> > The number of map tasks is determined by InputSplit.
> >
> > On Wednesday, June 30, 2010, Pierre ANCELOT <[email protected]> wrote:
> > > Hi,
> > > Okay, so, if I set the 20 by default, I could maybe limit the number of
> > > concurrent maps per node instead?
> > > job.setNumReduceTasks exists but I see no equivalent for maps, though I
> > > think there was a setNumMapTasks before...
> > > Was it removed? Why?
> > > Any idea about how to achieve this?
> > >
> > > Thank you.
> > >
> > > On Wed, Jun 30, 2010 at 12:08 PM, Amareshwari Sri Ramadasu <
> > > [email protected]> wrote:
> > >
> > >> Hi Pierre,
> > >>
> > >> "mapred.tasktracker.map.tasks.maximum" is a cluster level configuration,
> > >> cannot be set per job. It is loaded only while bringing up the TaskTracker.
> > >>
> > >> Thanks
> > >> Amareshwari
> > >>
> > >> On 6/30/10 3:05 PM, "Pierre ANCELOT" <[email protected]> wrote:
> > >>
> > >> Hi everyone :)
> > >> There's something I'm probably doing wrong but I can't seem to figure out
> > >> what.
> > >> I have two hadoop programs running one after the other.
> > >> This is done because they don't have the same needs in term of processor in
> > >> memory, so by separating them I optimize each task better.
> > >> Fact is, I need for the first job on every node
> > >> mapred.tasktracker.map.tasks.maximum set to 12.
> > >> For the second task, I need it to be set to 20.
> > >> so by default I set it to 12 and in the second job's code, I set this:
> > >>
> > >> Configuration hadoopConfiguration = new Configuration();
> > >> hadoopConfiguration.setInt("mapred.tasktracker.map.tasks.maximum", 20);
> > >>
> > >> But when running the job, instead of having the 20 tasks on each node as
> > >> expected, I have 12....
> > >> Any idea please?
> > >>
> > >> Thank you.
> > >> Pierre.
> > >>
> > >> --
> > >> http://www.neko-consulting.com
> > >> Ego sum quis ego servo
> > >> "Je suis ce que je protège"
> > >> "I am what I protect"
