Runping Qi wrote:
I have also been thinking about the Hadoop job scheduling issue.
In my applications, some jobs depend on the outputs of other jobs.
Therefore, the job dependencies form a DAG. A job is ready to run if and only if
it has no dependencies or all the jobs it depends on have finished
successfully. To help schedule and monitor a group of jobs like that, I am
thinking of implementing a utility class that:
      - accepts jobs with dependency specifications
      - monitors job status
      - submits jobs when they are ready

With such a utility class, the application can construct its jobs, specify
their dependencies, and then hand the jobs to the utility class. The utility
class takes care of the details of job submission.
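
As a rough illustration, here is a minimal sketch of what such a utility could
look like against the org.apache.hadoop.mapred API (JobClient.submitJob,
RunningJob.isComplete/isSuccessful). The DagJob and JobRunner names, the poll
interval, and the fail-fast behavior are assumptions of mine, not an existing
Hadoop class, and the sketch assumes the dependency graph is acyclic:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

// One node in the job DAG: a JobConf plus the jobs it depends on.
class DagJob {
  final JobConf conf;
  final List<DagJob> dependencies = new ArrayList<DagJob>();
  RunningJob running;     // null until submitted
  boolean succeeded;

  DagJob(JobConf conf) { this.conf = conf; }
  void dependsOn(DagJob other) { dependencies.add(other); }

  // Ready iff not yet submitted and every dependency succeeded.
  boolean ready() {
    if (running != null) return false;
    for (DagJob d : dependencies)
      if (!d.succeeded) return false;
    return true;
  }
}

// Accepts jobs, submits them when they become ready, polls their status.
class JobRunner {
  private final List<DagJob> jobs = new ArrayList<DagJob>();

  void add(DagJob job) { jobs.add(job); }

  void run(JobClient client) throws IOException, InterruptedException {
    int done = 0;
    while (done < jobs.size()) {
      for (DagJob j : jobs) {
        if (j.ready()) {
          j.running = client.submitJob(j.conf);   // non-blocking submit
        } else if (j.running != null && !j.succeeded && j.running.isComplete()) {
          if (!j.running.isSuccessful())
            throw new IOException("job failed: " + j.conf.getJobName());
          j.succeeded = true;
          done++;
        }
      }
      Thread.sleep(5000);   // poll interval
    }
  }
}

An application would build DagJob instances, wire up dependsOn edges, add them
to a JobRunner, and call run(new JobClient(new JobConf())).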
A general solution would be sweet. Up to now, I've been using a bit of BeanShell wrapper script to schedule jobs. If the jobs are interdependent, I have the wrapper let the exception out and stop processing; otherwise, I catch the exception, log which job failed, and schedule a new one. Roughly, the logic is like the sketch below.
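
In Java terms rather than BeanShell (Wrapper, INTERDEPENDENT, and resubmit are
placeholders for the script's own configuration and helpers; JobClient.runJob
throws IOException when a job fails):

import java.io.IOException;

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

class Wrapper {
  static final boolean INTERDEPENDENT = true;  // set per job group

  static void runAll(JobConf[] jobs) throws IOException {
    for (JobConf job : jobs) {
      try {
        JobClient.runJob(job);        // submits and blocks until done
      } catch (IOException e) {
        if (INTERDEPENDENT)
          throw e;                    // let it out; stop processing
        System.err.println("job failed: " + job.getJobName());
        resubmit(job);                // schedule a replacement job
      }
    }
  }

  static void resubmit(JobConf job) { /* hypothetical helper */ }
}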

St.Ack
