Hi, I'd like to get Hadoop running on a large university cluster that many people use to run different types of applications. We currently use Torque to assign nodes and manage the queue. I want to let people request "n" processors and have Hadoop start automatically on those "n" nodes, with one node becoming the name node and the rest becoming the workers. Once the job completes, the name node should shut down the cluster and terminate all Hadoop processes.
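To make the idea concrete, here is a rough sketch of the node-assignment step I have in mind. It assumes the job runs under Torque, which lists the allocated hosts in the file named by the PBS_NODEFILE environment variable, one line per processor. The function names are just made up for illustration, and starting and stopping the Hadoop daemons is not shown.

```python
import os


def split_nodes(nodefile_text):
    """Return (name_node, workers) from the contents of a Torque node file.

    The node file repeats a host once per allocated processor, so
    duplicates are dropped while the original order is kept.
    """
    hosts = []
    for line in nodefile_text.splitlines():
        host = line.strip()
        if host and host not in hosts:
            hosts.append(host)
    if not hosts:
        raise ValueError("node file lists no hosts")
    # Assumption: the first allocated host acts as the name node.
    return hosts[0], hosts[1:]


def nodes_from_env():
    """Read the node file that Torque exposes through PBS_NODEFILE."""
    with open(os.environ["PBS_NODEFILE"]) as f:
        return split_nodes(f.read())
```

Inside a job script, nodes_from_env() would give the host to use as the name node and the list of worker hosts. With a single allocated host, the worker list comes back empty.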
I came across HOD (Hadoop On Demand), which seems to fit my needs. However, the most up-to-date documentation I could find was for Hadoop 0.17.1, at: http://hadoop.apache.org/common/docs/r0.17.1/hod.html

I'm wondering:

1- Whether I'm on the right track, i.e. can I use Hadoop for my needs, or is there a better alternative? We can't set up a dedicated Hadoop cluster, as other applications need to run on the same cluster nodes.

2- Whether there is a more recent version of HOD that works with Hadoop 0.20.1.

Thanks in advance,
Jim
