Thanks Aaron.

On Wed, Apr 8, 2009 at 12:00 AM, Aaron Kimball <[email protected]> wrote:
> Amit,
>
> The mapred.tasktracker.map.tasks.maximum and
> mapred.tasktracker.reduce.tasks.maximum properties can be controlled on a
> per-host basis in their hadoop-site.xml files. With this you can configure
> nodes with more/fewer cores/RAM/etc. to take on varying amounts of work.
>
> There's no current mechanism to provide feedback to the task scheduler,
> though, based on actual machine utilization in real time.
>
> - Aaron
>
> On Tue, Apr 7, 2009 at 7:54 AM, amit handa <[email protected]> wrote:
>
> > Hi,
> >
> > Is there a way I can control the number of tasks that can be spawned on a
> > machine, based on the machine's capacity and how loaded the machine
> > already is?
> >
> > My use case is as follows:
> >
> > I have to perform task 1, task 2, task 3, ... task n.
> > These tasks have varied CPU and memory usage patterns.
> > All tasks of type task 1 and task 3 can take 80-90% CPU and 800 MB of RAM.
> > All tasks of type task 2 take only 1-2% of CPU and 5-10 MB of RAM.
> >
> > How do I model this using Hadoop? Can I use only one cluster for running
> > all these types of tasks?
> > Shall I use a different Hadoop cluster for each task type? If yes, then
> > how do I share data between these tasks (the data can be a few MB to a
> > few GB)?
> >
> > Please suggest or point to any docs which I can dig up.
> >
> > Thanks,
> > Amit
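For anyone following along: the per-host limits Aaron describes live in each node's own hadoop-site.xml (the 0.x-era config file). A minimal sketch for a larger node is below; the slot counts are illustrative values, not recommendations, and the TaskTracker typically needs a restart to pick up changes.

```xml
<!-- hadoop-site.xml on one TaskTracker node.
     Slot values are illustrative; tune per node's cores and RAM. -->
<configuration>
  <!-- Maximum number of map tasks run concurrently on this node -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
  </property>
  <!-- Maximum number of reduce tasks run concurrently on this node -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

A smaller or more heavily shared node would ship the same file with lower values, which is how heterogeneous hardware is accommodated; the scheduler itself does not observe live load.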
