Thanks Aaron.

On Wed, Apr 8, 2009 at 12:00 AM, Aaron Kimball <[email protected]> wrote:
> Amit,
>
> The mapred.tasktracker.map.tasks.maximum and
> mapred.tasktracker.reduce.tasks.maximum properties can be controlled on a
> per-host basis in their hadoop-site.xml files. With this you can configure
> nodes with more/fewer cores/RAM/etc. to take on varying amounts of work.
>
> There's no current mechanism to provide feedback to the task scheduler,
> though, based on actual machine utilization in real time.
>
> - Aaron
>
> On Tue, Apr 7, 2009 at 7:54 AM, amit handa <[email protected]> wrote:
>
> > Hi,
> >
> > Is there a way I can control the number of tasks that can be spawned on a
> > machine, based on the machine's capacity and how loaded the machine
> > already is?
> >
> > My use case is as follows:
> >
> > I have to perform task 1, task 2, task 3, ... task n.
> > These tasks have varied CPU and memory usage patterns.
> > All tasks of type task 1 and task 3 can take 80-90% CPU and 800 MB of RAM.
> > All tasks of type task 2 take only 1-2% of CPU and 5-10 MB of RAM.
> >
> > How do I model this using Hadoop? Can I use only one cluster for running
> > all these types of tasks?
> > Shall I use a different Hadoop cluster for each task type? If yes, then
> > how do I share data between these tasks (the data can be a few MB to a
> > few GB)?
> >
> > Please suggest or point to any docs which I can dig up.
> >
> > Thanks,
> > Amit
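For anyone following along: the per-host limits Aaron describes live in each node's own hadoop-site.xml (the 0.x-era config file). A minimal sketch for a larger node is below; the slot counts are illustrative values, not recommendations, and the TaskTracker typically needs a restart to pick up changes.

```xml
<!-- hadoop-site.xml on one TaskTracker node.
     Slot values are illustrative; tune per node's cores and RAM. -->
<configuration>
  <!-- Maximum number of map tasks run concurrently on this node -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
  </property>
  <!-- Maximum number of reduce tasks run concurrently on this node -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

A smaller or more heavily shared node would ship the same file with lower values, which is how heterogeneous hardware is accommodated; the scheduler itself does not observe live load.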
