What you may be looking for is a workflow system such as Oozie (yahoo.github.com/oozie/) or Azkaban (http://sna-projects.com/azkaban/).
If your needs are simple (2-3 jobs per workflow, not too many conditions, etc.), you can check out the JobControl API (http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/jobcontrol/package-summary.html) that Hadoop offers. It lets you add dependent jobs and create uncomplicated dependency chains.

P.S. Know that phases such as M-M-M-M can usually be collapsed into a single M. If you want modularity in code to represent the phases, check out ChainMapper (http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/lib/ChainMapper.html).

On Mon, Jul 25, 2011 at 11:50 PM, Ross Nordeen <rjnor...@mtu.edu> wrote:
> Hello all,
>
> I am trying to write a MR program where the output from the mappers are
> dependent on the previous map processes. I understand that a job scheduler
> exists to control such processes. Would anyone be able to give some sample
> code of a working implementation of this in hadoop 0.20.2?
>
> --
> Ross Nordeen
> Computer Networking And Systems Administration
> Michigan Technological University
> http://www.linkedin.com/in/rjnordee

--
Harsh J
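
To illustrate the point about M-M-M-M collapsing into a single M, here is a rough sketch in plain Java. It deliberately does not use the Hadoop API: the class FusedMapPhases and the three per-record phases (trim, lowerCase, tag) are made-up names for illustration only. The idea is just that a chain of map-only steps, each consuming the previous step's output record by record, can be composed into one function. In a real job that fused logic would presumably sit inside a single map() call, or be wired together with ChainMapper if you want to keep the phases as separate classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class FusedMapPhases {

    // Three hypothetical per-record "map phases" (illustrative only).
    public static String trim(String record) {
        return record.trim();
    }

    public static String lowerCase(String record) {
        return record.toLowerCase();
    }

    public static String tag(String record) {
        return record + "\tseen";
    }

    // Compose the phases, in order, into a single per-record function.
    // This is the "M-M-M-M can simply be M" idea: one pass over the data
    // instead of one job (and one round of intermediate output) per phase.
    public static Function<String, String> fuse(List<Function<String, String>> phases) {
        Function<String, String> fused = Function.identity();
        for (Function<String, String> phase : phases) {
            fused = fused.andThen(phase);
        }
        return fused;
    }

    public static void main(String[] args) {
        List<Function<String, String>> phases = new ArrayList<>();
        phases.add(FusedMapPhases::trim);
        phases.add(FusedMapPhases::lowerCase);
        phases.add(FusedMapPhases::tag);

        Function<String, String> singleMap = fuse(phases);
        System.out.println(singleMap.apply("  Hello World "));
    }
}
```

This only works when each phase looks at one record at a time. If a later phase needs the complete output of an earlier one (for example an aggregate over all records), the phases are genuinely separate jobs, and that is where JobControl or a workflow system comes in.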