[
https://issues.apache.org/jira/browse/HADOOP-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alejandro Abdelnur updated HADOOP-5303:
---------------------------------------
Summary: Oozie, Hadoop Workflow System (was: Hadoop Workflow System (HWS))
adding Oozie to the issue summary
> Oozie, Hadoop Workflow System
> -----------------------------
>
> Key: HADOOP-5303
> URL: https://issues.apache.org/jira/browse/HADOOP-5303
> Project: Hadoop Core
> Issue Type: New Feature
> Reporter: Alejandro Abdelnur
> Assignee: Alejandro Abdelnur
> Attachments: Hadoop_Summit_Oozie.pdf, hws-preso-v1_0_2009FEB22.pdf,
> hws-spec2009MAR09.pdf, hws-v1_0_2009FEB22.pdf,
> oozie-0.18.3.o0.1-SNAPSHOT-distro.tar.gz, oozie-spec-20090521.pdf,
> oozie-src-20090605.tar.gz
>
>
> This is a proposal for a system specialized in running Hadoop/Pig jobs in a
> control dependency DAG (Directed Acyclic Graph), i.e. a Hadoop workflow application.
> Attached are a complete specification and a high-level overview
> presentation.
> ----
> *Highlights*
> A Workflow application is a DAG that coordinates the following types of
> actions: Hadoop, Pig, Ssh, Http, Email and sub-workflows.
> Flow control operations within the workflow applications can be done using
> decision, fork and join nodes. Cycles in workflows are not supported.
> Actions and decisions can be parameterized with job properties, action
> output (e.g. Hadoop counters, Ssh key/value pair output) and file
> information (file existence, file size, etc.). Formal parameters are expressed
> in the workflow definition as {{${VAR}}} variables.
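> As an illustration only (the element and function names below are assumptions
> based on this description, not the final schema in the attached spec), a
> workflow definition with a fork/join pair and a parameterized decision might
> look like:
> {code:xml}
> <workflow-app name="sample-wf">
>   <start to="forknode"/>
>   <!-- fork runs the Hadoop and Pig actions concurrently -->
>   <fork name="forknode">
>     <path start="mrnode"/>
>     <path start="pignode"/>
>   </fork>
>   <action name="mrnode">
>     <map-reduce>
>       <job-tracker>${jobTracker}</job-tracker>
>       <name-node>${nameNode}</name-node>
>     </map-reduce>
>     <ok to="joinnode"/>
>     <error to="fail"/>
>   </action>
>   <action name="pignode">
>     <pig>
>       <script>${appPath}/script.pig</script>
>     </pig>
>     <ok to="joinnode"/>
>     <error to="fail"/>
>   </action>
>   <!-- join waits for both forked paths before continuing -->
>   <join name="joinnode" to="checknode"/>
>   <!-- decision routes on file information -->
>   <decision name="checknode">
>     <switch>
>       <case to="end">${fs:exists(outputDir)}</case>
>       <default to="fail"/>
>     </switch>
>   </decision>
>   <kill name="fail">
>     <message>Workflow failed</message>
>   </kill>
>   <end name="end"/>
> </workflow-app>
> {code}
> The {{${jobTracker}}}, {{${nameNode}}}, {{${appPath}}} and {{${outputDir}}}
> formal parameters would be resolved from the properties provided at job
> submission; {{fs:exists()}} stands in for the file-information functions
> mentioned above.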
> A Workflow application is a ZIP file that contains the workflow definition
> (an XML file) and all the files necessary to run its actions: JAR files for
> Map/Reduce jobs, shells for streaming Map/Reduce jobs, native libraries, Pig
> scripts, and other resource files.
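> For example, such a ZIP could be laid out as follows (the file names are
> hypothetical):
> {code}
> workflow.xml          workflow definition
> lib/myjobs.jar        JAR files for Map/Reduce jobs
> lib/mystreaming.sh    shells for streaming Map/Reduce jobs
> lib/libnative.so      native libraries
> script.pig            Pig scripts
> conf/resources.xml    other resource files
> {code}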
> Before running a workflow job, the corresponding workflow application must be
> deployed in HWS.
> Deploying workflow applications and running workflow jobs can be done via
> command line tools, a WS API and a Java API.
> Monitoring the system and workflow jobs can be done via a web console,
> command line tools, a WS API and a Java API.
> When submitting a workflow job, a set of properties resolving all the formal
> parameters in the workflow definition must be provided. This set of
> properties is a Hadoop configuration.
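> As a sketch, the properties resolving the formal parameters of the
> definition above could be supplied as a standard Hadoop configuration file
> (the values are hypothetical):
> {code:xml}
> <configuration>
>   <property>
>     <name>jobTracker</name>
>     <value>jt.example.com:9001</value>
>   </property>
>   <property>
>     <name>nameNode</name>
>     <value>hdfs://nn.example.com:9000</value>
>   </property>
>   <property>
>     <name>appPath</name>
>     <value>/user/alejandro/apps/sample-wf</value>
>   </property>
>   <property>
>     <name>outputDir</name>
>     <value>/user/alejandro/output</value>
>   </property>
> </configuration>
> {code}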
> Possible states for a workflow job are: {{CREATED}}, {{RUNNING}},
> {{SUSPENDED}}, {{SUCCEEDED}}, {{KILLED}} and {{FAILED}}.
> In the case of an action failure in a workflow job, depending on the type of
> failure, HWS will attempt automatic retries, request a manual retry, or fail
> the workflow job.
> HWS can make HTTP callback notifications on action start/end/failure events
> and workflow end/failure events.
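> As a purely illustrative sketch, the callback endpoints could be configured
> as submission properties with templated tokens; the property name and
> {{$jobId}}/{{$nodeName}}/{{$status}} tokens below follow later Oozie
> releases and may differ from this proposal:
> {code:xml}
> <property>
>   <name>oozie.wf.action.notification.url</name>
>   <value>http://example.com/notify?job=$jobId&amp;node=$nodeName&amp;status=$status</value>
> </property>
> {code}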
> In the case of workflow job failure, the workflow job can be resubmitted,
> skipping the previously completed actions. Before resubmission, the workflow
> application can be updated with a patch to fix a problem in the workflow
> application code.
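> As an assumed sketch (the property name is taken from later Oozie releases
> and may not match this proposal), a resubmission could list the completed
> nodes to skip:
> {code:xml}
> <property>
>   <name>oozie.wf.rerun.skip.nodes</name>
>   <value>mrnode,pignode</value>
> </property>
> {code}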
> ----
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.