Hi,

Is there a way to control the number of tasks that can be spawned on a machine based on the machine's capacity and how loaded it already is?
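The closest control I have found so far is the static per-node slot limit in mapred-site.xml, roughly like the sketch below (assuming a Hadoop 1.x-style TaskTracker; the values are only examples). As far as I can tell this caps the number of concurrent tasks per node but does not react to how loaded the machine actually is:

    <!-- mapred-site.xml on each worker node. These limits are fixed at
         TaskTracker start-up and are not adjusted to current CPU/memory load. -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>4</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>2</value>
    </property>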
My use case is as follows: I have to perform task1, task2, task3, ..., taskN. These tasks have varied CPU and memory usage patterns: tasks of type task1 and task3 can each take 80-90% CPU and 800 MB of RAM, while tasks of type task2 take only 1-2% CPU and 5-10 MB of RAM.

How do I model this with Hadoop? Can I use a single cluster to run all of these task types, or should I use a separate Hadoop cluster for each task type? If separate clusters, how do I share data between the tasks (the data can be a few MB to a few GB)?

Please suggest an approach or point me to any docs I can dig into.

Thanks,
Amit
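PS: One direction I am considering is per-job memory requests, so the scheduler packs fewer heavy tasks onto each node. This is just a sketch, assuming a Hadoop 1.x cluster where the admin has enabled memory-based scheduling with the capacity scheduler and set the cluster-wide slot sizes (mapred.cluster.map.memory.mb etc.):

    <!-- Config submitted with the heavy task1/task3 jobs: ask for ~1 GB per
         map task so fewer of them are scheduled on the same node. -->
    <property>
      <name>mapred.job.map.memory.mb</name>
      <value>1024</value>
    </property>

    <!-- Config submitted with the lightweight task2 jobs: ~64 MB per map
         task, so many of them can share a node. -->
    <property>
      <name>mapred.job.map.memory.mb</name>
      <value>64</value>
    </property>

Is this the right direction, or is there a better way?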
