Repository: oozie
Updated Branches:
  refs/heads/master 89248dd26 -> 22b51b0b3

OOZIE-3377 [docs] Remaining 5.1.0 documentation changes (andras.piros)

Project: http://git-wip-us.apache.org/repos/asf/oozie/repo
Commit: http://git-wip-us.apache.org/repos/asf/oozie/commit/22b51b0b
Tree: http://git-wip-us.apache.org/repos/asf/oozie/tree/22b51b0b
Diff: http://git-wip-us.apache.org/repos/asf/oozie/diff/22b51b0b

Branch: refs/heads/master
Commit: 22b51b0b328b638f1873e92a91143cbdd8ec75a4
Parents: 89248dd
Author: Andras Piros <[email protected]>
Authored: Mon Nov 5 17:04:25 2018 +0100
Committer: Andras Piros <[email protected]>
Committed: Mon Nov 5 17:04:25 2018 +0100

----------------------------------------------------------------------
 docs/src/site/markdown/DG_GitActionExtension.md | 159 +++++++++++++++++++
 .../src/site/markdown/WorkflowFunctionalSpec.md | 119 --------------
 docs/src/site/markdown/index.md                 |   2 +
 release-log.txt                                 |   1 +
 4 files changed, 162 insertions(+), 119 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/oozie/blob/22b51b0b/docs/src/site/markdown/DG_GitActionExtension.md
----------------------------------------------------------------------
diff --git a/docs/src/site/markdown/DG_GitActionExtension.md b/docs/src/site/markdown/DG_GitActionExtension.md
new file mode 100644
index 0000000..3ec9494
--- /dev/null
+++ b/docs/src/site/markdown/DG_GitActionExtension.md
@@ -0,0 +1,159 @@
+
+
+[::Go back to Oozie Documentation Index::](index.html)
+
+-----
+
+# Oozie Git Action Extension
+
+<!-- MACRO{toc|fromDepth=1|toDepth=4} -->
+
+## Git Action
+
+The `git` action allows one to clone a Git repository into HDFS. The supported options are `git-uri`, `branch`, `key-path`
+and `destination-uri`.
+
+The `git clone` action is executed asynchronously by one of the YARN containers assigned to run on the cluster. If an SSH key is
+specified it will be created on the file system in a YARN container's local directory, relying on YARN NodeManager to remove the
+file after the action has run.
+
+Path names specified in the `git` action should be able to be parameterized (templatized) using EL expressions,
+e.g. `${wf:user()}` . Path name should be specified as an absolute path. Each file path must specify the file system URI.
+
+**Syntax:**
+
+
+```
+<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0">
+    ...
+    <action name="[NODE-NAME]">
+        <git>
+            <git-uri>[SOURCE-URI]</git-uri>
+            ...
+            <branch>[BRANCH]</branch>
+            ...
+            <key-path>[HDFS-PATH]</key-path>
+            ...
+            <destination-uri>[HDFS-PATH]</destination-uri>
+        </git>
+        <ok to="[NODE-NAME]"/>
+        <error to="[NODE-NAME]"/>
+    </action>
+    ...
+</workflow-app>
+```
+
+**Example:**
+
+
+```
+<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
+    ...
+    <action name="clone_oozie">
+        <git>
+            <git-uri>https://github.com/apache/oozie</git-uri>
+            <destination-uri>hdfs://my_git_repo_directory</destination-uri>
+        </git>
+        <ok to="myotherjob"/>
+        <error to="errorcleanup"/>
+    </action>
+    ...
+</workflow-app>
+```
+
+In the above example, a Git repository on e.g. GitHub.com is cloned to the HDFS directory `my_git_repo_directory` which should not
+exist previously on the filesystem. Note that repository addresses outside of GitHub.com but accessible to the YARN container
+running the Git action may also be used.
+
+If a `name-node` element is specified, then it is not necessary for any of the paths to start with the file system URI as it is
+taken from the `name-node` element.
+
+The `resource-manager` (Oozie 5.x) element has to be specified to name the YARN ResourceManager address.
+
+If any of the paths need to be served from another HDFS namenode, its address has to be part of
+that filesystem URI prefix:
+
+```
+<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0">
+    ...
+    <action name="[NODE-NAME]">
+        <git>
+            ...
+            <name-node>hdfs://name-node.first.company.com:8020</name-node>
+            ...
+            <key-path>hdfs://name-node.second.company.com:8020/[HDFS-PATH]</key-path>
+            ...
+        </git>
+        ...
+    </action>
+    ...
+</workflow-app>
+```
+
+This is also true if the name-node is specified in the global section (see
+[Global Configurations](WorkflowFunctionalSpec.html#GlobalConfigurations)).
+
+Be aware that `key-path` might point to a secure object store location other than the current `fs.defaultFS`. In that case,
+appropriate file permissions are still necessary (readable by submitting user), credentials provided, etc.
+
+As of workflow schema 1.0, zero or more `job-xml` elements can be specified; these must refer to Hadoop JobConf `job.xml` formatted
+files bundled in the workflow application. They can be used to set additional properties for the `FileSystem` instance.
+
+As of schema workflow schema 1.0, if a `configuration` element is specified, then it will also be used to set additional `JobConf`
+properties for the `FileSystem` instance. Properties specified in the `configuration` element are overridden by properties
+specified in the files specified by any `job-xml` elements.
+
+**Example:**
+
+```
+<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0">
+    ...
+    <action name="[NODE-NAME]">
+        <git>
+            ...
+            <name-node>hdfs://foo:8020</name-node>
+            <job-xml>fs-info.xml</job-xml>
+            <configuration>
+                <property>
+                    <name>some.property</name>
+                    <value>some.value</value>
+                </property>
+            </configuration>
+        </git>
+        ...
+    </action>
+    ...
+</workflow>
+```
+
+## Appendix, Git XML-Schema
+
+### AE.A Appendix A, Git XML-Schema
+
+#### Git Action Schema Version 1.0
+
+```
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
+           xmlns:git="uri:oozie:git-action:1.0"
+           elementFormDefault="qualified"
+           targetNamespace="uri:oozie:git-action:1.0">
+    <xs:include schemaLocation="oozie-common-1.0.xsd"/>
+    <xs:element name="git" type="git:ACTION"/>
+    <xs:complexType name="ACTION">
+        <xs:sequence>
+            <xs:element name="resource-manager" type="xs:string" minOccurs="0" maxOccurs="1"/>
+            <xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/>
+            <xs:element name="prepare" type="git:PREPARE" minOccurs="0" maxOccurs="1"/>
+            <xs:element name="git-uri" type="xs:string" minOccurs="1" maxOccurs="1"/>
+            <xs:element name="branch" type="xs:string" minOccurs="0" maxOccurs="1"/>
+            <xs:element name="key-path" type="xs:string" minOccurs="0" maxOccurs="1"/>
+            <xs:element name="destination-uri" type="xs:string" minOccurs="1" maxOccurs="1"/>
+            <xs:element name="configuration" type="git:CONFIGURATION" minOccurs="0" maxOccurs="1"/>
+        </xs:sequence>
+    </xs:complexType>
+</xs:schema>
+```
+
+[::Go back to Oozie Documentation Index::](index.html)
+
+

http://git-wip-us.apache.org/repos/asf/oozie/blob/22b51b0b/docs/src/site/markdown/WorkflowFunctionalSpec.md
----------------------------------------------------------------------
diff --git a/docs/src/site/markdown/WorkflowFunctionalSpec.md b/docs/src/site/markdown/WorkflowFunctionalSpec.md
index 463635b..9576175 100644
--- a/docs/src/site/markdown/WorkflowFunctionalSpec.md
+++ b/docs/src/site/markdown/WorkflowFunctionalSpec.md
@@ -1656,125 +1656,6 @@ the main-class element has priority.
 existing Main to see how it works before creating your own. In fact, its probably simplest to just subclass the existing Main and add/modify/overwrite any behavior you want to change.
-<a name="GitAction"></a>
-#### 3.2.7 Git action
-
-The `git` action allows one to clone a Git repository into HDFS. The supported options are `git-uri`, `branch`, `key-path`
-and `destination-uri`.
-
-The `git clone` action is executed asynchronously by one of the YARN containers assigned to run on the cluster. If an SSH key is
-specified it will be created on the file system in a YARN container's local directory, relying on YARN NodeManager to remove the
-file after the action has run.
-
-Path names specified in the `git` action should be able to be parameterized (templatized) using EL expressions,
-e.g. `${wf:user()}` . Path name should be specified as an absolute path. Each file path must specify the file system URI.
-
-**Syntax:**
-
-
-```
-<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0">
-    ...
-    <action name="[NODE-NAME]">
-        <git>
-            <git-uri>[SOURCE-URI]</git-uri>
-            ...
-            <branch>[BRANCH]</branch>
-            ...
-            <key-path>[HDFS-PATH]</key-path>
-            ...
-            <destination-uri>[HDFS-PATH]</destination-uri>
-        </git>
-        <ok to="[NODE-NAME]"/>
-        <error to="[NODE-NAME]"/>
-    </action>
-    ...
-</workflow-app>
-```
-
-**Example:**
-
-
-```
-<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
-    ...
-    <action name="clone_oozie">
-        <git>
-            <git-uri>https://github.com/apache/oozie</git-uri>
-            <destination-uri>hdfs://my_git_repo_directory</destination-uri>
-        </git>
-        <ok to="myotherjob"/>
-        <error to="errorcleanup"/>
-    </action>
-    ...
-</workflow-app>
-```
-
-In the above example, a Git repository on e.g. GitHub.com is cloned to the HDFS directory `my_git_repo_directory` which should not
-exist previously on the filesystem. Note that repository addresses outside of GitHub.com but accessible to the YARN container
-running the Git action may also be used.
-
-If a `name-node` element is specified, then it is not necessary for any of the paths to start with the file system URI as it is
-taken from the `name-node` element.
-
-The `resource-manager` (Oozie 5.x) element has to be specified to name the YARN ResourceManager address.
-
-If any of the paths need to be served from another HDFS namenode, its address has to be part of
-that filesystem URI prefix:
-
-```
-<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0">
-    ...
-    <action name="[NODE-NAME]">
-        <git>
-            ...
-            <name-node>hdfs://name-node.first.company.com:8020</name-node>
-            ...
-            <key-path>hdfs://name-node.second.company.com:8020/[HDFS-PATH]</key-path>
-            ...
-        </git>
-        ...
-    </action>
-    ...
-</workflow-app>
-```
-
-This is also true if the name-node is specified in the global section (see
-[Global Configurations](WorkflowFunctionalSpec.html#GlobalConfigurations)).
-
-Be aware that `key-path` might point to a secure object store location other than the current `fs.defaultFS`. In that case,
-appropriate file permissions are still necessary (readable by submitting user), credentials provided, etc.
-
-As of workflow schema 1.0, zero or more `job-xml` elements can be specified; these must refer to Hadoop JobConf `job.xml` formatted
-files bundled in the workflow application. They can be used to set additional properties for the `FileSystem` instance.
-
-As of schema workflow schema 1.0, if a `configuration` element is specified, then it will also be used to set additional `JobConf`
-properties for the `FileSystem` instance. Properties specified in the `configuration` element are overridden by properties
-specified in the files specified by any `job-xml` elements.
-
-**Example:**
-
-```
-<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0">
-    ...
-    <action name="[NODE-NAME]">
-        <git>
-            ...
-            <name-node>hdfs://foo:8020</name-node>
-            <job-xml>fs-info.xml</job-xml>
-            <configuration>
-                <property>
-                    <name>some.property</name>
-                    <value>some.value</value>
-                </property>
-            </configuration>
-        </git>
-        ...
-    </action>
-    ...
-</workflow>
-```
-
 <a name="WorkflowParameterization"></a>
 ## 4 Parameterization of Workflows

http://git-wip-us.apache.org/repos/asf/oozie/blob/22b51b0b/docs/src/site/markdown/index.md
----------------------------------------------------------------------
diff --git a/docs/src/site/markdown/index.md b/docs/src/site/markdown/index.md
index 0216222..c8d4cf2 100644
--- a/docs/src/site/markdown/index.md
+++ b/docs/src/site/markdown/index.md
@@ -48,6 +48,7 @@ Enough reading already? Follow the steps in [Oozie Quick Start](DG_QuickStart.ht
  * [Oozie Core Javadocs](./core/apidocs/index.html)
  * [Oozie Web Services API](WebServicesAPI.html)
  * [Action Authentication](DG_ActionAuthentication.html)
+ * [Fluent Job API](DG_FluentJobAPI.html)
 ### Action Extensions
@@ -59,6 +60,7 @@ Enough reading already? Follow the steps in [Oozie Quick Start](DG_QuickStart.ht
  * [Ssh Action](DG_SshActionExtension.html)
  * [DistCp Action](DG_DistCpActionExtension.html)
  * [Spark Action](DG_SparkActionExtension.html)
+ * [Git Action](DG_GitActionExtension.html)
  * [Writing a Custom Action Executor](DG_CustomActionExecutor.html)
 ### Job Status and SLA Monitoring

http://git-wip-us.apache.org/repos/asf/oozie/blob/22b51b0b/release-log.txt
----------------------------------------------------------------------
diff --git a/release-log.txt b/release-log.txt
index 70156a4..d8561d3 100644
--- a/release-log.txt
+++ b/release-log.txt
@@ -9,6 +9,7 @@ OOZIE-3277 [build] Check for star imports (kmarton via andras.piros)
 -- Oozie 5.1.0 release
+OOZIE-3377 [docs] Remaining 5.1.0 documentation changes (andras.piros)
 OOZIE-3376 [tests] TestGraphGenerator should assume JDK8 minor version at least 1.8.0_u40 (andras.piros)
 OOZIE-3370 amend Property filtering is not consistent across job submission (andras.piros)
 OOZIE-3370 Property filtering is not consistent across job submission (andras.piros)
