[ https://issues.apache.org/jira/browse/HADOOP-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676694#action_12676694 ]

Alejandro Abdelnur commented on HADOOP-5303:
--------------------------------------------

h4. Regarding language independence for launching WF jobs,

Yes, that is one of the motivations of the WS API.
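For illustration, here is a minimal sketch of what submitting a WF job over the WS API could look like from any HTTP-capable client (shown in Java); the endpoint path and payload shape below are hypothetical, not part of the spec:

{code:java}
// Hypothetical sketch: submitting a WF job through the HWS WS API over plain HTTP.
// The '/hws/v0/jobs' path is illustrative only; it is not a defined endpoint.
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class SubmitJob {
    public static void main(String[] args) throws Exception {
        // Job properties resolving the WF formal parameters, as a Hadoop configuration XML.
        byte[] config = ("<configuration>"
            + "<property><name>inputDir</name><value>/user/me/input</value></property>"
            + "</configuration>").getBytes(StandardCharsets.UTF_8);

        URL url = new URL("http://hws-server:8080/hws/v0/jobs"); // hypothetical endpoint
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/xml");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(config);
        }
        System.out.println("HTTP " + conn.getResponseCode()); // the job ID would come back in the response
    }
}
{code}

Since the request is just a Hadoop configuration XML over HTTP, any language with an HTTP client can submit and manage WF jobs.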

h4. Regarding HWS being server-side,

Yes, to provide scalability, failover, monitoring and manageability. We want to 
move away from client-side solutions.

With many thousands of WF jobs per day, a client-side solution that runs a 
process per WF job is not feasible.

h4. Regarding leveraging existing workflow engines,

Definitely, that has been our approach from the beginning. Our current 
implementation uses jBPM (which uses Hibernate for persistence). 

jBPM & Hibernate are LGPL. Apache has provisions to work with LGPL, though this 
complicates things a bit (for testing, integration, bundling, and potentially 
an extra layer of indirection).

Once there is agreement in the Hadoop community regarding the functional spec, 
we can discuss the specifics of the implementation.

We can easily replace jBPM with another implementation, and we are doing some 
work in this area.
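As a rough sketch of that indirection layer, the engine could sit behind a small interface so the WS/Java APIs never see jBPM directly; the type names below are hypothetical, not actual HWS classes:

{code:java}
// Hypothetical sketch of an engine-neutral indirection layer; none of these
// types exist in HWS, they only illustrate how jBPM could be swapped out.
import java.util.Properties;

interface WorkflowEngine {
    String submit(String wfDefinitionXml, Properties jobProperties);
    void suspend(String jobId);
    void resume(String jobId);
    void kill(String jobId);
}

// A jBPM-backed implementation lives behind the interface, so replacing the
// engine does not touch the WS, Java or command line APIs.
class JbpmWorkflowEngine implements WorkflowEngine {
    @Override public String submit(String wfXml, Properties props) {
        throw new UnsupportedOperationException("illustrative stub");
    }
    @Override public void suspend(String jobId) { /* delegate to jBPM */ }
    @Override public void resume(String jobId)  { /* delegate to jBPM */ }
    @Override public void kill(String jobId)    { /* delegate to jBPM */ }
}
{code}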

h4. Regarding naming corrections,

"hadoop" action to "map-reduce" action and "hdfs" to "fs", makes sense.

h4. Regarding HDFS generalization,

As we use the Hadoop FileSystem API, the generalization is already there; it is 
just a naming issue, and per the previous comment we could rename 'hdfs' to 'fs'.
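For example, Hadoop's standard FileSystem API resolves the concrete filesystem implementation from the URI scheme, so a generic 'fs' action works for any supported filesystem, not just HDFS; a minimal sketch using the standard Hadoop API (host and paths are illustrative):

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The URI scheme picks the implementation: hdfs://, file://, etc.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000/"), conf);
        Path p = new Path("/user/me/output/_done");
        if (fs.exists(p)) {
            System.out.println("size=" + fs.getFileStatus(p).getLen());
        }
    }
}
{code}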

h4. Regarding querying WF job progress,

Yes. You can get the current status of the WF job, plus stats about completed 
and running actions: start time, end time, resolved configuration values (with 
all EL expressions evaluated), current status, exit transition, a link to the 
action's web console if any (i.e., the Hadoop job web console link), etc.
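As an illustration only, the kind of Java API this information maps to could look like the sketch below; {{HWSClient}}, {{WorkflowJob}} and {{WorkflowAction}} are hypothetical names, not the proposed classes:

{code:java}
// Hypothetical sketch only: these interfaces mirror the information listed
// above; they are NOT the proposed HWS classes, just illustrative names.
import java.util.Date;
import java.util.List;
import java.util.Properties;

interface WorkflowAction {
    String getName();
    Date getStartTime();
    Date getEndTime();
    Properties getConfiguration(); // resolved values, EL expressions evaluated
    String getStatus();            // current status of the action
    String getTransition();        // exit transition taken
    String getConsoleUrl();        // e.g. the Hadoop job web console link
}

interface WorkflowJob {
    String getStatus();            // CREATED, RUNNING, SUSPENDED, ...
    List<WorkflowAction> getActions();
}

interface HWSClient {
    WorkflowJob getJobInfo(String jobId);
}
{code}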



> Hadoop Workflow System (HWS)
> ----------------------------
>
>                 Key: HADOOP-5303
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5303
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: hws-preso-v1_0_2009FEB22.pdf, hws-v1_0_2009FEB22.pdf
>
>
> This is a proposal for a system specialized in running Hadoop/Pig jobs in a 
> control dependency DAG (Directed Acyclic Graph), a Hadoop workflow application.
> Attached are a complete specification and a high-level overview presentation.
> ----
> *Highlights* 
> A Workflow application is a DAG that coordinates the following types of 
> actions: Hadoop, Pig, Ssh, Http, Email and sub-workflows. 
> Flow control operations within the workflow applications can be done using 
> decision, fork and join nodes. Cycles in workflows are not supported.
> Actions and decisions can be parameterized with job properties, action 
> output (e.g., Hadoop counters, Ssh key/value pair output) and file 
> information (file exists, file size, etc.). Formal parameters are expressed in 
> the workflow definition as {{${VAR}}} variables.
> A Workflow application is a ZIP file that contains the workflow definition 
> (an XML file) and all the files necessary to run its actions: JAR files for 
> Map/Reduce jobs, shell scripts for streaming Map/Reduce jobs, native 
> libraries, Pig scripts, and other resource files.
> Before running a workflow job, the corresponding workflow application must be 
> deployed in HWS.
> Deploying workflow applications and running workflow jobs can be done via 
> command line tools, a WS API and a Java API.
> Monitoring the system and workflow jobs can be done via a web console, 
> command line tools, a WS API and a Java API.
> When submitting a workflow job, a set of properties resolving all the formal 
> parameters in the workflow definition must be provided. This set of 
> properties is a Hadoop configuration.
> Possible states for a workflow job are: {{CREATED}}, {{RUNNING}}, 
> {{SUSPENDED}}, {{SUCCEEDED}}, {{KILLED}} and {{FAILED}}.
> In the case of an action failure in a workflow job, depending on the type of 
> failure, HWS will attempt automatic retries, request a manual retry, or fail 
> the workflow job.
> HWS can make HTTP callback notifications on action start/end/failure events 
> and workflow end/failure events.
> In the case of a workflow job failure, the workflow job can be resubmitted, 
> skipping previously completed actions. Before resubmission, the workflow 
> application can be updated with a patch to fix a problem in the workflow 
> application code.
> ----
