I am new to Hadoop (I have not yet installed or configured it), and I want to make sure I have the correct tool for the job. I do not currently need the Map/Reduce functionality, but I am interested in using Hadoop for task orchestration, task monitoring, etc. across numerous nodes in a computing cluster. Our primary programs (written in C++ and launched via shell scripts) each run independently on a single node, but are deployed to different nodes for load balancing. I want to initiate these processes on different nodes from a Java program located on a central server, and I was hoping to use Hadoop as the foundation for this.
I read the following in the FAQ section:

"How do I use Hadoop Streaming to run an arbitrary set of (semi-)independent tasks? Often you do not need the full power of Map Reduce, but only need to run multiple instances of the same program - either on different parts of the data, or on the same data, but with different parameters. You can use Hadoop Streaming to do this."

So, I guess I have two questions:

1. Can I use Hadoop for this purpose without using the Map/Reduce functionality?
2. Are there any examples available of how to implement this sort of configuration?
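From reading the streaming documentation, I think I understand the general shape of what that FAQ entry describes: a map-only job (zero reducers), where each line of an input file becomes one task handled by a wrapper script. Something like the following, I assume (untested; the jar path, input/output directories, and script name are just placeholders I made up):

    # Map-only streaming job: with reducers set to zero, each input line
    # is handed to an instance of the mapper script, which runs on its own.
    hadoop jar /path/to/hadoop-streaming.jar \
        -D mapred.reduce.tasks=0 \
        -input /params/task-list.txt \
        -output /results/run1 \
        -mapper run_task.sh \
        -file run_task.sh

Here run_task.sh would read a parameter line from stdin and launch our C++ binary accordingly, with -file shipping the script out to the nodes. Is that roughly the intended pattern?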
Any help would be greatly appreciated. Sam