What you want to do can be accomplished in the scheduler. Take a look
at the fair scheduler, specifically its user-extensible options. There
you will find the ability to add extra logic for deciding, on a per-job
basis, whether a task can be launched. It could be as simple as
deciding that a particular job can't launch more than 12 tasks at a time.

Capacity scheduler might be able to do this too, but I'm not sure.
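
To illustrate the kind of per-job check I mean, here is a minimal sketch.
It is plain Java and does not use the fair scheduler's real classes:
PerJobTaskLimiter, setLimit and canLaunchTask are hypothetical names, and
the job names are made up. The limits 12 and 20 are the numbers from this
thread. In a real setup the same decision would have to live in whatever
extension point the scheduler actually exposes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT a Hadoop API: decides per job whether one more
// task may be launched on a node, given how many are already running there.
public class PerJobTaskLimiter {
    // Per-job cap on concurrently running tasks per node (assumed policy).
    private final Map<String, Integer> maxTasksPerNode = new HashMap<>();
    // Cap applied to jobs that have no explicit entry.
    private final int defaultLimit;

    public PerJobTaskLimiter(int defaultLimit) {
        this.defaultLimit = defaultLimit;
    }

    public void setLimit(String jobName, int limit) {
        maxTasksPerNode.put(jobName, limit);
    }

    // True if the job is still below its per-node cap on this node.
    public boolean canLaunchTask(String jobName, int runningTasksOfJobOnNode) {
        Integer limit = maxTasksPerNode.get(jobName);
        int effectiveLimit = (limit != null) ? limit : defaultLimit;
        return runningTasksOfJobOnNode < effectiveLimit;
    }

    public static void main(String[] args) {
        PerJobTaskLimiter limiter = new PerJobTaskLimiter(20);
        limiter.setLimit("first-job", 12);

        System.out.println(limiter.canLaunchTask("first-job", 11));  // true
        System.out.println(limiter.canLaunchTask("first-job", 12));  // false
        System.out.println(limiter.canLaunchTask("second-job", 19)); // true
    }
}
```

The point of the design is that the tasktracker slot count stays at the
larger value (20) for the whole cluster, and the scheduler-side check
holds the first job back to 12 per node.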

On Wednesday, June 30, 2010, Pierre ANCELOT <[email protected]> wrote:
> ok, well, thanks...
> I truly hoped a solution would exist for this.
> Thanks.
>
> Pierre.
>
> On Wed, Jun 30, 2010 at 3:56 PM, Yu Li <[email protected]> wrote:
>
>> Hi Pierre,
>>
>> The "setNumReduceTasks" method sets the number of reduce tasks to
>> launch; it is equivalent to setting the "mapred.reduce.tasks" parameter, while the
>> "mapred.tasktracker.reduce.tasks.maximum" parameter decides the number of
>> tasks running *concurrently* on one node.
>> And as Amareshwari mentioned, the
>> "mapred.tasktracker.map/reduce.tasks.maximum" is a cluster configuration
>> which cannot be set per job. If you set
>> mapred.tasktracker.map.tasks.maximum to 20, and the overall number of map
>> tasks is larger than 20*<number of nodes>, there would be 20 map tasks running
>> concurrently on a node. As far as I know, you probably need to restart the
>> tasktracker if you truly need to change the configuration.
>>
>> Best Regards,
>> Carp
>>
>> 2010/6/30 Pierre ANCELOT <[email protected]>
>>
>> > Sure, but not the number of tasks running concurrently on a node at the
>> > same time.
>> >
>> >
>> >
>> > On Wed, Jun 30, 2010 at 1:57 PM, Ted Yu <[email protected]> wrote:
>> >
>> > > The number of map tasks is determined by InputSplit.
>> > >
>> > > On Wednesday, June 30, 2010, Pierre ANCELOT <[email protected]> wrote:
>> > > > Hi,
>> > > > Okay, so, if I set it to 20 by default, I could maybe limit the number of
>> > > > concurrent maps per node instead?
>> > > > job.setNumReduceTasks exists but I see no equivalent for maps, though I
>> > > > think there was a setNumMapTasks before...
>> > > > Was it removed? Why?
>> > > > Any idea about how to achieve this?
>> > > >
>> > > > Thank you.
>> > > >
>> > > >
>> > > > On Wed, Jun 30, 2010 at 12:08 PM, Amareshwari Sri Ramadasu <
>> > > > [email protected]> wrote:
>> > > >
>> > > >> Hi Pierre,
>> > > >>
>> > > >> "mapred.tasktracker.map.tasks.maximum" is a cluster-level configuration
>> > > >> and cannot be set per job. It is loaded only while bringing up the
>> > > >> TaskTracker.
>> > > >>
>> > > >> Thanks
>> > > >> Amareshwari
>> > > >>
>> > > >> On 6/30/10 3:05 PM, "Pierre ANCELOT" <[email protected]> wrote:
>> > > >>
>> > > >> Hi everyone :)
>> > > >> There's something I'm probably doing wrong but I can't seem to figure out
>> > > >> what.
>> > > >> I have two hadoop programs running one after the other.
>> > > >> This is done because they don't have the same needs in terms of processor
>> > > >> and memory, so by separating them I optimize each task better.
>> > > >> Fact is, for the first job I need mapred.tasktracker.map.tasks.maximum
>> > > >> set to 12 on every node.
>> > > >> For the second job, I need it to be set to 20.
>> > > >> So by default I set it to 12, and in the second job's code I set this:
>> > > >>
>> > > >>        Configuration hadoopConfiguration = new Configuration();
>> > > >>        hadoopConfiguration.setInt("mapred.tasktracker.map.tasks.maximum", 20);
>> > > >>
>> > > >> But when running the job, instead of having the 20 tasks on each node as
>> > > >> expected, I have 12....
>> > > >> Any idea please?
>> > > >>
>> > > >> Thank you.
>> > > >> Pierre.
>> > > >>
>> > --
> http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"
>
