What you want to do can be accomplished in the scheduler. Take a look at the fair scheduler, specifically its user-extensible options. There you will find the ability to add extra logic for deciding, on a per-job basis, whether a task can be launched. It could be as simple as deciding that a particular job can't launch more than 12 tasks at a time.
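To make the idea concrete, here is a rough sketch of the kind of per-job check I mean. It is plain Java and does not use any Hadoop classes: the class name PerJobTaskCap, its methods, and the job names are all made up for illustration, and the real fair scheduler extension points will look different. It assumes the tasktrackers are configured with 20 map slots and that the first job should be held to 12 concurrent tasks per node.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical per-job concurrency cap. This is NOT a Hadoop API; it only
 * illustrates the decision a scheduler extension would make before
 * launching one more task of a job on a node.
 */
class PerJobTaskCap {
    // Per-job limit on tasks running concurrently on a single node.
    private final Map<String, Integer> capPerJob = new HashMap<>();
    // Limit for jobs with no explicit cap (assumed equal to the node's slot count).
    private final int defaultCap;

    PerJobTaskCap(int defaultCap) {
        this.defaultCap = defaultCap;
    }

    void setCap(String jobName, int cap) {
        capPerJob.put(jobName, cap);
    }

    /** True if one more task of the job may start on a node already running runningOnNode of them. */
    boolean canLaunch(String jobName, int runningOnNode) {
        int cap = capPerJob.getOrDefault(jobName, defaultCap);
        return runningOnNode < cap;
    }

    public static void main(String[] args) {
        PerJobTaskCap caps = new PerJobTaskCap(20); // cluster-wide slots per node
        caps.setCap("first-job", 12);               // hypothetical job name
        System.out.println(caps.canLaunch("first-job", 11));  // below its cap of 12
        System.out.println(caps.canLaunch("first-job", 12));  // at its cap
        System.out.println(caps.canLaunch("second-job", 12)); // falls back to the default of 20
    }
}
```

The point is that the slot count stays a cluster setting (20 here) and the per-job limit is enforced when the scheduler decides whether to hand out another task.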
Capacity scheduler might be able to do this too, but I'm not sure.

On Wednesday, June 30, 2010, Pierre ANCELOT <[email protected]> wrote:
> ok, well, thanks...
> I truly hoped a solution would exist for this.
> Thanks.
>
> Pierre.
>
> On Wed, Jun 30, 2010 at 3:56 PM, Yu Li <[email protected]> wrote:
>
>> Hi Pierre,
>>
>> The "setNumReduceTasks" method is for setting the number of reduce tasks to
>> launch, it's equal to set the "mapred.reduce.tasks" parameter, while the
>> "mapred.tasktracker.reduce.tasks.maximum" parameter decides the number of
>> tasks running *concurrently* on one node.
>> And as Amareshwari mentioned, the
>> "mapred.tasktracker.map/reduce.tasks.maximum" is a cluster configuration
>> which could not be set per job. If you set
>> mapred.tasktracker.map.tasks.maximum to 20, and the overall number of map
>> tasks is larger than 20*<nodes number>, there would be 20 map tasks running
>> concurrently on a node. As I know, you probably need to restart the
>> tasktracker if you truly need to change the configuration.
>>
>> Best Regards,
>> Carp
>>
>> 2010/6/30 Pierre ANCELOT <[email protected]>
>>
>> > Sure, but not the number of tasks running concurrently on a node at the
>> > same time.
>> >
>> >
>> >
>> > On Wed, Jun 30, 2010 at 1:57 PM, Ted Yu <[email protected]> wrote:
>> >
>> > > The number of map tasks is determined by InputSplit.
>> > >
>> > > On Wednesday, June 30, 2010, Pierre ANCELOT <[email protected]> wrote:
>> > > > Hi,
>> > > > Okay, so, if I set the 20 by default, I could maybe limit the number of
>> > > > concurrent maps per node instead?
>> > > > job.setNumReduceTasks exists but I see no equivalent for maps, though I
>> > > > think there was a setNumMapTasks before...
>> > > > Was it removed? Why?
>> > > > Any idea about how to achieve this?
>> > > >
>> > > > Thank you.
>> > > >
>> > > >
>> > > > On Wed, Jun 30, 2010 at 12:08 PM, Amareshwari Sri Ramadasu <[email protected]> wrote:
>> > > >
>> > > >> Hi Pierre,
>> > > >>
>> > > >> "mapred.tasktracker.map.tasks.maximum" is a cluster level configuration,
>> > > >> cannot be set per job. It is loaded only while bringing up the TaskTracker.
>> > > >>
>> > > >> Thanks
>> > > >> Amareshwari
>> > > >>
>> > > >> On 6/30/10 3:05 PM, "Pierre ANCELOT" <[email protected]> wrote:
>> > > >>
>> > > >> Hi everyone :)
>> > > >> There's something I'm probably doing wrong but I can't seem to figure out
>> > > >> what.
>> > > >> I have two hadoop programs running one after the other.
>> > > >> This is done because they don't have the same needs in terms of processor
>> > > >> and memory, so by separating them I optimize each task better.
>> > > >> Fact is, I need for the first job on every node
>> > > >> mapred.tasktracker.map.tasks.maximum set to 12.
>> > > >> For the second task, I need it to be set to 20.
>> > > >> so by default I set it to 12 and in the second job's code, I set this:
>> > > >>
>> > > >> Configuration hadoopConfiguration = new Configuration();
>> > > >> hadoopConfiguration.setInt("mapred.tasktracker.map.tasks.maximum", 20);
>> > > >>
>> > > >> But when running the job, instead of having the 20 tasks on each node as
>> > > >> expected, I have 12....
>> > > >> Any idea please?
>> > > >>
>> > > >> Thank you.
>> > > >> Pierre.
>> > > >>
>> >
> --
> http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"
>
