[ 
https://issues.apache.org/jira/browse/OOZIE-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511650#comment-16511650
 ] 

Robert Kanter commented on OOZIE-2339:
--------------------------------------

+1 LGTM.  Looks like the remaining comments [~pbacsko] is working on are 
largely cleanup.  So I think we're essentially done once that's all taken care 
of.  Great work!

I think it would be helpful to have some easy way to generate the XML from a 
Fluent Workflow as part of developing/debugging it.  For instance, maybe some 
kind of {{main}} class that takes a {{WorkflowFactory}} and outputs the XML 
to a file or stdout.  Something that can easily be run when writing a 
{{WorkflowFactory}} in an IDE without having to assemble a JAR.  That would be 
helpful for users who are more familiar with the XML, or who are trying out 
the Fluent API and want to get an idea of what it looks like before submitting 
it.  This could be a followup JIRA if you agree.
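
A minimal sketch of what such a helper might look like.  Everything here is 
hypothetical and only illustrates the shape of the idea: the 
{{WorkflowXmlPrinter}} class, the nested {{WorkflowFactory}} interface with its 
{{createXml}} method, and {{ExampleFactory}} are stand-ins, not the types from 
the patch.  The real factory would presumably return a workflow object that is 
then marshalled to XML; here the factory returns the XML string directly to 
keep the example self-contained.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class WorkflowXmlPrinter {

    /** Hypothetical stand-in; the real Fluent Job API type may differ. */
    interface WorkflowFactory {
        /** Assumed to return the workflow already serialized as XML. */
        String createXml();
    }

    /** Placeholder factory returning a hand-written, minimal workflow. */
    public static final class ExampleFactory implements WorkflowFactory {
        @Override
        public String createXml() {
            return "<workflow-app xmlns=\"uri:oozie:workflow:1.0\" name=\"example\">\n"
                 + "    <start to=\"end\"/>\n"
                 + "    <end name=\"end\"/>\n"
                 + "</workflow-app>";
        }
    }

    /** Instantiates the named factory via its no-arg constructor and returns its XML. */
    static String render(String factoryClassName) {
        try {
            Object instance = Class.forName(factoryClassName)
                    .getDeclaredConstructor()
                    .newInstance();
            if (!(instance instanceof WorkflowFactory)) {
                throw new IllegalArgumentException(
                        factoryClassName + " is not a WorkflowFactory");
            }
            return ((WorkflowFactory) instance).createXml();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Cannot instantiate " + factoryClassName, e);
        }
    }

    /** Prints the XML to stdout, or writes it to the file given as second argument. */
    public static void main(String[] args) throws IOException {
        if (args.length < 1 || args.length > 2) {
            System.err.println("Usage: WorkflowXmlPrinter <factory class> [output file]");
            System.exit(1);
        }
        String xml = render(args[0]);
        if (args.length == 2) {
            Files.write(Paths.get(args[1]), xml.getBytes(StandardCharsets.UTF_8));
        } else {
            System.out.println(xml);
        }
    }
}
```

Run from an IDE with the factory's fully qualified class name as the first 
argument, this prints the XML to stdout; an optional second argument names a 
file to write instead.  No JAR needs to be assembled, since the factory only 
has to be on the run configuration's classpath.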

> [fluent-job] Minimum Viable Fluent Job API
> ------------------------------------------
>
>                 Key: OOZIE-2339
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2339
>             Project: Oozie
>          Issue Type: New Feature
>          Components: client, core, examples, fluent-job, tests
>    Affects Versions: 4.3.0
>            Reporter: Robert Kanter
>            Assignee: Andras Piros
>            Priority: Major
>             Fix For: 5.1.0
>
>         Attachments: OOZIE-2339.001.patch, OOZIE-2339.002.patch, 
> OOZIE-2339.003.patch, OOZIE-2339.004.patch, OOZIE-2339.005.patch, 
> OOZIE-2339.006.patch, OOZIE-2339.008.patch, OOZIE-2339.010.patch, 
> OOZIE-2339.011.patch, OOZIE-2339.012.patch, OOZIE-2339.013.patch, 
> OOZIE-2339.014.patch, OOZIE-2339.015.patch, OOZIE-2339.016.patch, 
> OOZIE-2339.017.patch, OOZIE-2339.018.patch, OOZIE-2339.019.patch
>
>
> Users often complain about the XML they have to write for Oozie jobs.  It 
> would be nice if they could write them in something like Java, but we don't 
> want to have to maintain a separate Java API for this.  I was looking around 
> and saw that JAXB might be the right thing here.  From what I can tell, it 
> lets you create Java classes from XSD schemas.  So, we should be able to 
> auto-generate a Java API for writing Oozie jobs, without having to really 
> maintain it.
> We should investigate if this is feasible and, if so, implement it.
> Some useful looking links:
> * [JAXB 
> overview|https://en.wikipedia.org/wiki/Java_Architecture_for_XML_Binding]
> * [JAXB description|https://jaxb.java.net/2.2.11/docs/ch03.html]
> * [Maven JAXB plugin|https://java.net/projects/maven-jaxb2-plugin/pages/Home]
> * [Apache Falcon|https://falcon.apache.org]
> Key features:
> * must have:
> ** inside a {{fluent-job-api}} artifact
> ** able to create workflow / coordinator / bundle definitions programmatically
> ** synchronizing each and every XSD change on rebuild
> ** can write {{workflow.xml}}, {{coordinator.xml}}, {{bundle.xml}}, and 
> {{jobs.properties}} artifacts of every XSD version
> ** cloneability of workflow etc. {{Object}}s
> ** perform cross checks, e.g. that the workflow graph is a DAG
> ** only the latest XSD versions must be supported
> * nice to have:
> ** XSD version(s) can be provided. When not provided, latest ones are 
> considered as valid
> ** implement a [*fluent API*|https://en.wikipedia.org/wiki/Fluent_interface]
> ** have a Python / Jython / Py4J REPL so that data engineers / data 
> scientists can also experiment easily
> ** create documentation about usage
> ** can read {{workflow.xml}}, {{coordinator.xml}}, {{bundle.xml}}, and 
> {{jobs.properties}} artifacts of every XSD version
> ** can convert between XSD versions
> ** support XSD change on the fly (within REPL)
> ** support HDFS reads / writes
> ** support dry run on an Oozie server to perform checks



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
