Added: oozie/site/trunk/content/resources/docs/5.0.0/DG_SparkActionExtension.html URL: http://svn.apache.org/viewvc/oozie/site/trunk/content/resources/docs/5.0.0/DG_SparkActionExtension.html?rev=1828722&view=auto ============================================================================== --- oozie/site/trunk/content/resources/docs/5.0.0/DG_SparkActionExtension.html (added) +++ oozie/site/trunk/content/resources/docs/5.0.0/DG_SparkActionExtension.html Mon Apr 9 14:12:36 2018 @@ -0,0 +1,564 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at Apr 9, 2018 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Oozie - </title> + <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="./css/site.css" /> + <link rel="stylesheet" href="./css/print.css" media="print" /> + + + <script type="text/javascript" src="./js/apache-maven-fluido-1.4.min.js"></script> + + + </head> + <body class="topBarDisabled"> + + + + <div class="container-fluid"> + <div id="banner"> + <div class="pull-left"> + <a href="https://oozie.apache.org/" id="bannerLeft"> + <img src="https://oozie.apache.org/images/oozie_200x.png" alt="Oozie"/> + </a> + </div> + <div class="pull-right"> </div> + <div class="clear"><hr/></div> + </div> + + <div id="breadcrumbs"> + <ul class="breadcrumb"> + + + <li class=""> + <a href="../../" title="Apache"> + Apache</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../../" title="Oozie"> + Oozie</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../" title="docs"> + docs</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="./" title="5.0.0"> + 5.0.0</a> + <span class="divider">/</span> + </li> + <li class="active 
">Oozie - </li> + + + + <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-04-09</li> + <li id="projectVersion" class="pull-right"> + Version: 5.0.0 + </li> + + </ul> + </div> + + + <div class="row-fluid"> + <div id="leftColumn" class="span2"> + <div class="well sidebar-nav"> + + + <ul class="nav nav-list"> + </ul> + + + + <hr /> + + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> + <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /> + </a> + </div> + </div> + </div> + + + <div id="bodyColumn" class="span10" > + + <p></p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<hr /> +<a name="Oozie_Spark_Action_Extension"></a> +<div class="section"><h2> Oozie Spark Action Extension</h2> +<p><ul><ul><li><a href="#Spark_Action">Spark Action</a> +<ul><li><a href="#Spark_Action_Logging">Spark Action Logging</a> +</li> +<li><a href="#Spark_on_YARN">Spark on YARN</a> +</li> +<li><a href="#PySpark_with_Spark_Action">PySpark with Spark Action</a> +</li> +<li><a href="#Using_Symlink_in_jar">Using Symlink in <jar></a> +</li> +</ul> +</li> +<li><a href="#Appendix_Spark_XML-Schema">Appendix, Spark XML-Schema</a> +<ul><li><a href="#AE.A_Appendix_A_Spark_XML-Schema">AE.A Appendix A, Spark XML-Schema</a> +<ul><li><a href="#Spark_Action_Schema_Version_1.0">Spark Action Schema Version 1.0</a> +</li> +<li><a href="#Spark_Action_Schema_Version_0.2">Spark Action Schema Version 0.2</a> +</li> +<li><a href="#Spark_Action_Schema_Version_0.1">Spark Action Schema Version 0.1</a> +</li> +</ul> +</li> +</ul> +</li> +</ul> +</ul> +</p> +<a name="Spark_Action"></a> +<div class="section"><h3>Spark Action</h3> +<p>The <tt>spark</tt> + action runs a Spark job.</p> +<p>The workflow job will wait until the Spark job completes before 
+continuing to the next action.</p> +<p>To run the Spark job, you have to configure the <tt>spark</tt> + action with +the <tt>resource-manager</tt> +, <tt>name-node</tt> + and Spark <tt>master</tt> + elements, as +well as any other necessary elements, arguments and configuration.</p> +<p>Spark options can be specified in an element called <tt>spark-opts</tt> +.</p> +<p>A <tt>spark</tt> + action can be configured to create or delete HDFS directories +before starting the Spark job.</p> +<p>Oozie EL expressions can be used in the inline configuration. Property +values specified in the <tt>configuration</tt> + element override values specified +in the <tt>job-xml</tt> + file.</p> +<p><b>Syntax:</b> +</p> +<p><pre> +<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0"> + ... + <action name="[NODE-NAME]"> + <spark xmlns="uri:oozie:spark-action:1.0"> + <resource-manager>[RESOURCE-MANAGER]</resource-manager> + <name-node>[NAME-NODE]</name-node> + <prepare> + <delete path="[PATH]"/> + ... + <mkdir path="[PATH]"/> + ... + </prepare> + <job-xml>[SPARK SETTINGS FILE]</job-xml> + <configuration> + <property> + <name>[PROPERTY-NAME]</name> + <value>[PROPERTY-VALUE]</value> + </property> + ... + </configuration> + <master>[SPARK MASTER URL]</master> + <mode>[SPARK MODE]</mode> + <name>[SPARK JOB NAME]</name> + <class>[SPARK MAIN CLASS]</class> + <jar>[SPARK DEPENDENCIES JAR / PYTHON FILE]</jar> + <spark-opts>[SPARK-OPTIONS]</spark-opts> + <arg>[ARG-VALUE]</arg> + ... + <arg>[ARG-VALUE]</arg> + ... + </spark> + <ok to="[NODE-NAME]"/> + <error to="[NODE-NAME]"/> + </action> + ... +</workflow-app> +</pre></p> +<p>The <tt>prepare</tt> + element, if present, indicates a list of paths to delete +or create before starting the job. Specified paths must start with <tt>hdfs://HOST:PORT</tt> +.</p> +<p>The <tt>job-xml</tt> + element, if present, specifies a file containing configuration +for the Spark job. 
Multiple <tt>job-xml</tt> + elements are allowed in order to +specify multiple <tt>job.xml</tt> + files.</p> +<p>The <tt>configuration</tt> + element, if present, contains configuration +properties that are passed to the Spark job.</p> +<p>The <tt>master</tt> + element indicates the URL of the Spark master, for example <tt>spark://host:port</tt>, <tt>mesos://host:port</tt>, <tt>yarn-cluster</tt>, <tt>yarn-client</tt>, +or <tt>local</tt>.</p> +<p>The <tt>mode</tt> + element, if present, indicates where the Spark driver program runs: <tt>client</tt> or <tt>cluster</tt>. This is typically +not required because you can specify it as part of <tt>master</tt> + (for example, master=yarn with mode=client is equivalent to master=yarn-client). +A local <tt>master</tt> + always runs in client mode.</p> +<p>Depending on the <tt>master</tt> + (and <tt>mode</tt> +) entered, the Spark job will run differently as follows:<ul><li>local mode: everything runs in the Launcher Job.</li> +<li>yarn-client mode: the driver runs in the Launcher Job and the executors run in YARN.</li> +<li>yarn-cluster mode: both the driver and the executors run in YARN.</li> +</ul> +</p> +<p>The <tt>name</tt> + element indicates the name of the Spark application.</p> +<p>The <tt>class</tt> + element, if present, indicates the main class of the Spark application.</p> +<p>The <tt>jar</tt> + element indicates a comma-separated list of jars or Python files.</p> +<p>The <tt>spark-opts</tt> + element, if present, contains a list of Spark options that can be passed to Spark. Spark configuration +options can be passed by specifying '--conf key=value' or other Spark CLI options. 
+Values containing whitespace can be enclosed in double quotes.</p> +<p>Some examples of the <tt>spark-opts</tt> + element:<ul><li>'--conf key=value'</li> +<li>'--conf key1=value1 value2'</li> +<li>'--conf key1="value1 value2"'</li> +<li>'--conf key1=value1 key2="value2 value3"'</li> +<li>'--conf key=value --verbose --properties-file user.properties'</li> +</ul> +</p> +<p>There are several ways to define properties that will be passed to Spark. They are processed in the following order:<ul><li>propagated from <tt>oozie.service.SparkConfigurationService.spark.configurations</tt></li> +<li>read from a localized <tt>spark-defaults.conf</tt> + file</li> +<li>read from a file defined in <tt>spark-opts</tt> + via the <tt>--properties-file</tt> + option</li> +<li>properties defined in the <tt>spark-opts</tt> + element</li> +</ul> +</p> +<p>Properties processed later in this order take precedence over those processed earlier. +The server-propagated properties, the <tt>spark-defaults.conf</tt> + and the user-defined properties file are merged together into a +single properties file, as Spark handles only one file in its <tt>--properties-file</tt> + option.</p> +<p>The <tt>arg</tt> + element, if present, contains arguments that can be passed to the Spark application.</p> +<p>All the above elements can be parameterized (templatized) using EL +expressions.</p> +<p><b>Example:</b> +</p> +<p><pre> +<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:1.0"> + ... 
+ <action name="myfirstsparkjob"> + <spark xmlns="uri:oozie:spark-action:1.0"> + <resource-manager>foo:8032</resource-manager> + <name-node>bar:8020</name-node> + <prepare> + <delete path="${jobOutput}"/> + </prepare> + <configuration> + <property> + <name>mapred.compress.map.output</name> + <value>true</value> + </property> + </configuration> + <master>local[*]</master> + <mode>client</mode> + <name>Spark Example</name> + <class>org.apache.spark.examples.mllib.JavaALS</class> + <jar>/lib/spark-examples_2.10-1.1.0.jar</jar> + <spark-opts>--executor-memory 20G --num-executors 50 + --conf spark.executor.extraJavaOptions="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"</spark-opts> + <arg>inputpath=hdfs://localhost/input/file.txt</arg> + <arg>value=2</arg> + </spark> + <ok to="myotherjob"/> + <error to="errorcleanup"/> + </action> + ... +</workflow-app> +</pre></p> +<a name="Spark_Action_Logging"></a> +<div class="section"><h4>Spark Action Logging</h4> +<p>Spark action logs are redirected to the STDOUT/STDERR of the Oozie Launcher map-reduce job task that runs Spark.</p> +<p>In the Oozie web-console, the 'Console URL' link in the Spark action pop-up makes it possible +to navigate to the Oozie Launcher map-reduce job task logs via the Hadoop job-tracker web-console.</p> +<a name="Spark_on_YARN"></a> +</div> +<div class="section"><h4>Spark on YARN</h4> +<p>To ensure that your Spark job shows up in the Spark History Server, make sure to specify these three Spark configuration properties +either in <tt>spark-opts</tt> + with <tt>--conf</tt> + or through <tt>oozie.service.SparkConfigurationService.spark.configurations</tt> + in oozie-site.xml.</p> +<p>1. spark.yarn.historyServer.address=SPH-HOST:18088</p> +<p>2. spark.eventLog.dir=hdfs://NN:8020/user/spark/applicationHistory</p> +<p>3. 
spark.eventLog.enabled=true</p> +<a name="PySpark_with_Spark_Action"></a> +</div> +<div class="section"><h4>PySpark with Spark Action</h4> +<p>To submit PySpark scripts with the Spark action, the PySpark dependencies must be available in the sharelib or in the workflow's lib/ directory. +For more information, please refer to the <a href="./AG_Install.html#Oozie_Share_Lib">installation document</a>. +</p> +<p><b>Example:</b> +</p> +<p><pre> +<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:1.0"> + .... + <action name="myfirstpysparkjob"> + <spark xmlns="uri:oozie:spark-action:1.0"> + <resource-manager>foo:8032</resource-manager> + <name-node>bar:8020</name-node> + <prepare> + <delete path="${jobOutput}"/> + </prepare> + <configuration> + <property> + <name>mapred.compress.map.output</name> + <value>true</value> + </property> + </configuration> + <master>yarn-cluster</master> + <name>Spark Example</name> + <jar>pi.py</jar> + <spark-opts>--executor-memory 20G --num-executors 50 + --conf spark.executor.extraJavaOptions="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"</spark-opts> + <arg>100</arg> + </spark> + <ok to="myotherjob"/> + <error to="errorcleanup"/> + </action> + ... +</workflow-app> +</pre></p> +<p>The <tt>jar</tt> + element specifies the Python file. Refer to the file by its localized name, because only local files are allowed +in PySpark. The py file should be in the lib/ folder next to the workflow.xml or added using the <tt>file</tt> + element so that +it's localized to the working directory with just its name.</p> +<a name="Using_Symlink_in_jar"></a> +</div> +<div class="section"><h4>Using Symlink in <jar></h4> +<p>A symlink must be specified using the <tt><a href="./WorkflowFunctionalSpec.html#a3.2.2. +1_Adding_Files_and_Archives_for_the_Job">file</a> +</tt> + element. 
Then, you can use +the symlink name in the <tt>jar</tt> + element.</p> +<p><b>Example:</b> +</p> +<p>Specifying a relative path for the symlink:</p> +<p>Make sure that the file is within the application directory, i.e. <tt>oozie.wf.application.path</tt>. +<pre> + <spark xmlns="uri:oozie:spark-action:1.0"> + ... + <jar>py-spark-example-symlink.py</jar> + ... + ... + <file>py-spark.py#py-spark-example-symlink.py</file> + ... + </spark> +</pre></p> +<p>Specifying a full path for the symlink: +<pre> + <spark xmlns="uri:oozie:spark-action:1.0"> + ... + <jar>spark-example-symlink.jar</jar> + ... + ... + <file>hdfs://localhost:8020/user/testjars/all-oozie-examples.jar#spark-example-symlink.jar</file> + ... + </spark> +</pre></p> +<a name="Appendix_Spark_XML-Schema"></a> +</div> +</div> +<div class="section"><h3>Appendix, Spark XML-Schema</h3> +<a name="AE.A_Appendix_A_Spark_XML-Schema"></a> +<div class="section"><h4>AE.A Appendix A, Spark XML-Schema</h4> +<a name="Spark_Action_Schema_Version_1.0"></a> +<div class="section"><h5>Spark Action Schema Version 1.0</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:spark="uri:oozie:spark-action:1.0" elementFormDefault="qualified" + targetNamespace="uri:oozie:spark-action:1.0"> +. + <xs:include schemaLocation="oozie-common-1.0.xsd"/> +. + <xs:element name="spark" type="spark:ACTION"/> +. 
+ <xs:complexType name="ACTION"> + <xs:sequence> + <xs:choice> + <xs:element name="job-tracker" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="resource-manager" type="xs:string" minOccurs="0" maxOccurs="1"/> + </xs:choice> + <xs:element name="name-node" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="prepare" type="spark:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="launcher" type="spark:LAUNCHER" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="configuration" type="spark:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:element name="master" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="mode" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="name" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="class" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="jar" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="spark-opts" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="arg" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> +. 
+</xs:schema> +</pre></p> +<a name="Spark_Action_Schema_Version_0.2"></a> +</div> +<div class="section"><h5>Spark Action Schema Version 0.2</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:spark="uri:oozie:spark-action:0.2" elementFormDefault="qualified" + targetNamespace="uri:oozie:spark-action:0.2"> <xs:element name="spark" type="spark:ACTION"/> + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:element name="job-tracker" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="name-node" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="prepare" type="spark:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="configuration" type="spark:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:element name="master" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="mode" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="name" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="class" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="jar" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="spark-opts" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="arg" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="CONFIGURATION"> + <xs:sequence> + <xs:element name="property" minOccurs="1" maxOccurs="unbounded"> + <xs:complexType> + <xs:sequence> + <xs:element name="name" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="value" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="description" minOccurs="0" maxOccurs="1" type="xs:string"/> + </xs:sequence> + </xs:complexType> + 
</xs:element> + </xs:sequence> + </xs:complexType> + <xs:complexType name="PREPARE"> + <xs:sequence> + <xs:element name="delete" type="spark:DELETE" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="mkdir" type="spark:MKDIR" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="DELETE"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> + <xs:complexType name="MKDIR"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> +</xs:schema> +</pre> +</p> +<a name="Spark_Action_Schema_Version_0.1"></a> +</div> +<div class="section"><h5>Spark Action Schema Version 0.1</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:spark="uri:oozie:spark-action:0.1" elementFormDefault="qualified" + targetNamespace="uri:oozie:spark-action:0.1"> <xs:element name="spark" type="spark:ACTION"/> + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:element name="job-tracker" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="prepare" type="spark:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="configuration" type="spark:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:element name="master" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="mode" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="name" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="class" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="jar" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="spark-opts" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="arg" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="CONFIGURATION"> + 
<xs:sequence> + <xs:element name="property" minOccurs="1" maxOccurs="unbounded"> + <xs:complexType> + <xs:sequence> + <xs:element name="name" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="value" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="description" minOccurs="0" maxOccurs="1" type="xs:string"/> + </xs:sequence> + </xs:complexType> + </xs:element> + </xs:sequence> + </xs:complexType> + <xs:complexType name="PREPARE"> + <xs:sequence> + <xs:element name="delete" type="spark:DELETE" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="mkdir" type="spark:MKDIR" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="DELETE"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> + <xs:complexType name="MKDIR"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> +</xs:schema> +</pre> +<a href="./index.html">::Go back to Oozie Documentation Index::</a> + +</p> +<p></p> +</div> +</div> +</div> + + </div> + </div> + </div> + + <hr/> + + <footer> + <div class="container-fluid"> + <div class="row-fluid"> + <p >Copyright © 2018 + <a href="http://www.apache.org">Apache Software Foundation</a>. + All rights reserved. + + </p> + </div> + + + </div> + </footer> + </body> +</html>
Added: oozie/site/trunk/content/resources/docs/5.0.0/DG_SqoopActionExtension.html URL: http://svn.apache.org/viewvc/oozie/site/trunk/content/resources/docs/5.0.0/DG_SqoopActionExtension.html?rev=1828722&view=auto ============================================================================== --- oozie/site/trunk/content/resources/docs/5.0.0/DG_SqoopActionExtension.html (added) +++ oozie/site/trunk/content/resources/docs/5.0.0/DG_SqoopActionExtension.html Mon Apr 9 14:12:36 2018 @@ -0,0 +1,488 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at Apr 9, 2018 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Oozie - </title> + <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="./css/site.css" /> + <link rel="stylesheet" href="./css/print.css" media="print" /> + + + <script type="text/javascript" src="./js/apache-maven-fluido-1.4.min.js"></script> + + + </head> + <body class="topBarDisabled"> + + + + <div class="container-fluid"> + <div id="banner"> + <div class="pull-left"> + <a href="https://oozie.apache.org/" id="bannerLeft"> + <img src="https://oozie.apache.org/images/oozie_200x.png" alt="Oozie"/> + </a> + </div> + <div class="pull-right"> </div> + <div class="clear"><hr/></div> + </div> + + <div id="breadcrumbs"> + <ul class="breadcrumb"> + + + <li class=""> + <a href="../../" title="Apache"> + Apache</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../../" title="Oozie"> + Oozie</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../" title="docs"> + docs</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="./" title="5.0.0"> + 5.0.0</a> + <span class="divider">/</span> + </li> + <li class="active 
">Oozie - </li> + + + + <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-04-09</li> + <li id="projectVersion" class="pull-right"> + Version: 5.0.0 + </li> + + </ul> + </div> + + + <div class="row-fluid"> + <div id="leftColumn" class="span2"> + <div class="well sidebar-nav"> + + + <ul class="nav nav-list"> + </ul> + + + + <hr /> + + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> + <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /> + </a> + </div> + </div> + </div> + + + <div id="bodyColumn" class="span10" > + + <p></p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<hr /> +<a name="Oozie_Sqoop_Action_Extension"></a> +<div class="section"><h2> Oozie Sqoop Action Extension</h2> +<p><ul><ul><li><a href="#Sqoop_Action">Sqoop Action</a> +<ul><li><a href="#Sqoop_Action_Counters">Sqoop Action Counters</a> +</li> +<li><a href="#Sqoop_Action_Logging">Sqoop Action Logging</a> +</li> +</ul> +</li> +<li><a href="#Appendix_Sqoop_XML-Schema">Appendix, Sqoop XML-Schema</a> +<ul><li><a href="#AE.A_Appendix_A_Sqoop_XML-Schema">AE.A Appendix A, Sqoop XML-Schema</a> +<ul><li><a href="#Sqoop_Action_Schema_Version_1.0">Sqoop Action Schema Version 1.0</a> +</li> +<li><a href="#Sqoop_Action_Schema_Version_0.3">Sqoop Action Schema Version 0.3</a> +</li> +<li><a href="#Sqoop_Action_Schema_Version_0.2">Sqoop Action Schema Version 0.2</a> +</li> +</ul> +</li> +</ul> +</li> +</ul> +</ul> +</p> +<a name="Sqoop_Action"></a> +<div class="section"><h3>Sqoop Action</h3> +<p><b>IMPORTANT:</b> + The Sqoop action requires Apache Hadoop 1.x or 2.x.</p> +<p>The <tt>sqoop</tt> + action runs a Sqoop job.</p> +<p>The workflow job will wait until the Sqoop job completes before +continuing to the next action.</p> +<p>To run the Sqoop job, 
you have to configure the <tt>sqoop</tt> + action with the <tt>resource-manager</tt> +, <tt>name-node</tt> + and Sqoop <tt>command</tt> + +or <tt>arg</tt> + elements as well as configuration.</p> +<p>A <tt>sqoop</tt> + action can be configured to create or delete HDFS directories +before starting the Sqoop job.</p> +<p>Sqoop configuration can be specified with a file, using the <tt>job-xml</tt> + +element, and inline, using the <tt>configuration</tt> + element.</p> +<p>Oozie EL expressions can be used in the inline configuration. Property +values specified in the <tt>configuration</tt> + element override values specified +in the <tt>job-xml</tt> + file.</p> +<p>Note that YARN <tt>yarn.resourcemanager.address</tt> + / <tt>resource-manager</tt> + and HDFS <tt>fs.default.name</tt> + / <tt>name-node</tt> + properties must not +be present in the inline configuration.</p> +<p>As with Hadoop <tt>map-reduce</tt> + jobs, it is possible to add files and +archives in order to make them available to the Sqoop job. Refer to the +<a href="./WorkflowFunctionalSpec.html#FilesArchives">Adding Files and Archives for the Job</a> +section for more information about this feature.</p> +<p><b>Syntax:</b> +</p> +<p><pre> +<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0"> + ... + <action name="[NODE-NAME]"> + <sqoop xmlns="uri:oozie:sqoop-action:1.0"> + <resource-manager>[RESOURCE-MANAGER]</resource-manager> + <name-node>[NAME-NODE]</name-node> + <prepare> + <delete path="[PATH]"/> + ... + <mkdir path="[PATH]"/> + ... + </prepare> + <configuration> + <property> + <name>[PROPERTY-NAME]</name> + <value>[PROPERTY-VALUE]</value> + </property> + ... + </configuration> + <command>[SQOOP-COMMAND]</command> + <arg>[SQOOP-ARGUMENT]</arg> + ... + <file>[FILE-PATH]</file> + ... + <archive>[FILE-PATH]</archive> + ... + </sqoop> + <ok to="[NODE-NAME]"/> + <error to="[NODE-NAME]"/> + </action> + ... 
+</workflow-app> +</pre></p> +<p>The <tt>prepare</tt> + element, if present, indicates a list of paths to delete +or create before starting the job. Specified paths must start with <tt>hdfs://HOST:PORT</tt> +.</p> +<p>The <tt>job-xml</tt> + element, if present, specifies a file containing configuration +for the Sqoop job. As of schema 0.3, multiple <tt>job-xml</tt> + elements are allowed in order to +specify multiple <tt>job.xml</tt> + files.</p> +<p>The <tt>configuration</tt> + element, if present, contains configuration +properties that are passed to the Sqoop job.</p> +<p><b>Sqoop command</b> +</p> +<p>The Sqoop command can be specified either using the <tt>command</tt> + element or multiple <tt>arg</tt> + +elements.</p> +<p>When using the <tt>command</tt> + element, Oozie will split the command on every space +into multiple arguments.</p> +<p>When using the <tt>arg</tt> + elements, Oozie will pass each argument value as an argument to Sqoop.</p> +<p>The <tt>arg</tt> + variant should be used when there are spaces within a single argument.</p> +<p>Consult the Sqoop documentation for a complete list of valid Sqoop commands.</p> +<p>All the above elements can be parameterized (templatized) using EL +expressions.</p> +<p><b>Examples:</b> +</p> +<p>Using the <tt>command</tt> + element:</p> +<p><pre> +<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:1.0"> + ... + <action name="myfirstsqoopjob"> + <sqoop xmlns="uri:oozie:sqoop-action:1.0"> + <resource-manager>foo:8032</resource-manager> + <name-node>bar:8020</name-node> + <prepare> + <delete path="${jobOutput}"/> + </prepare> + <configuration> + <property> + <name>mapred.compress.map.output</name> + <value>true</value> + </property> + </configuration> + <command>import --connect jdbc:hsqldb:file:db.hsqldb --table TT --target-dir hdfs://localhost:8020/user/tucu/foo -m 1</command> + </sqoop> + <ok to="myotherjob"/> + <error to="errorcleanup"/> + </action> + ... 
+</workflow-app> +</pre></p> +<p>The same Sqoop action using <tt>arg</tt> + elements:</p> +<p><pre> +<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:1.0"> + ... + <action name="myfirstsqoopjob"> + <sqoop xmlns="uri:oozie:sqoop-action:1.0"> + <resource-manager>foo:8032</resource-manager> + <name-node>bar:8020</name-node> + <prepare> + <delete path="${jobOutput}"/> + </prepare> + <configuration> + <property> + <name>mapred.compress.map.output</name> + <value>true</value> + </property> + </configuration> + <arg>import</arg> + <arg>--connect</arg> + <arg>jdbc:hsqldb:file:db.hsqldb</arg> + <arg>--table</arg> + <arg>TT</arg> + <arg>--target-dir</arg> + <arg>hdfs://localhost:8020/user/tucu/foo</arg> + <arg>-m</arg> + <arg>1</arg> + </sqoop> + <ok to="myotherjob"/> + <error to="errorcleanup"/> + </action> + ... +</workflow-app> +</pre></p> +<p>NOTE: The <tt>arg</tt> + elements syntax, while more verbose, allows spaces within a single argument, which is useful when +using free-form queries.</p> +<a name="Sqoop_Action_Counters"></a> +<div class="section"><h4>Sqoop Action Counters</h4> +<p>The counters of the map-reduce job run by the Sqoop action are available to be used in the workflow via the +<a href="./WorkflowFunctionalSpec.html#HadoopCountersEL">hadoop:counters() EL function</a> +.</p> +<p>If the Sqoop action runs an import all command, the <tt>hadoop:counters()</tt> + EL will return the aggregated counters +of all map-reduce jobs run by the Sqoop import all command.</p> +<a name="Sqoop_Action_Logging"></a> +</div> +<div class="section"><h4>Sqoop Action Logging</h4> +<p>Sqoop action logs are redirected to the STDOUT/STDERR of the Oozie Launcher map-reduce job task that runs Sqoop.</p> +<p>In the Oozie web-console, the 'Console URL' link in the Sqoop action pop-up makes it possible +to navigate to the Oozie Launcher map-reduce job task logs via the Hadoop job-tracker web-console.</p> +<p>The logging level of the Sqoop action can be set in the Sqoop action 
configuration using the +property <tt>oozie.sqoop.log.level</tt> +. The default value is <tt>INFO</tt> +.</p> +<a name="Appendix_Sqoop_XML-Schema"></a> +</div> +</div> +<div class="section"><h3>Appendix, Sqoop XML-Schema</h3> +<a name="AE.A_Appendix_A_Sqoop_XML-Schema"></a> +<div class="section"><h4>AE.A Appendix A, Sqoop XML-Schema</h4> +<a name="Sqoop_Action_Schema_Version_1.0"></a> +<div class="section"><h5>Sqoop Action Schema Version 1.0</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:sqoop="uri:oozie:sqoop-action:1.0" + elementFormDefault="qualified" + targetNamespace="uri:oozie:sqoop-action:1.0"> +. + <xs:include schemaLocation="oozie-common-1.0.xsd"/> +. + <xs:element name="sqoop" type="sqoop:ACTION"/> +. + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:choice> + <xs:element name="job-tracker" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="resource-manager" type="xs:string" minOccurs="0" maxOccurs="1"/> + </xs:choice> + <xs:element name="name-node" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="prepare" type="sqoop:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="launcher" type="sqoop:LAUNCHER" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="configuration" type="sqoop:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:choice> + <xs:element name="command" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="arg" type="xs:string" minOccurs="1" maxOccurs="unbounded"/> + </xs:choice> + <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> +. 
+</xs:schema> +</pre></p> +<a name="Sqoop_Action_Schema_Version_0.3"></a> +</div> +<div class="section"><h5>Sqoop Action Schema Version 0.3</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:sqoop="uri:oozie:sqoop-action:0.3" elementFormDefault="qualified" + targetNamespace="uri:oozie:sqoop-action:0.3"> <xs:element name="sqoop" type="sqoop:ACTION"/> + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:element name="job-tracker" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="prepare" type="sqoop:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="configuration" type="sqoop:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:choice> + <xs:element name="command" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="arg" type="xs:string" minOccurs="1" maxOccurs="unbounded"/> + </xs:choice> + <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="CONFIGURATION"> + <xs:sequence> + <xs:element name="property" minOccurs="1" maxOccurs="unbounded"> + <xs:complexType> + <xs:sequence> + <xs:element name="name" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="value" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="description" minOccurs="0" maxOccurs="1" type="xs:string"/> + </xs:sequence> + </xs:complexType> + </xs:element> + </xs:sequence> + </xs:complexType> + <xs:complexType name="PREPARE"> + <xs:sequence> + <xs:element name="delete" type="sqoop:DELETE" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="mkdir" type="sqoop:MKDIR" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="DELETE"> + 
<xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> + <xs:complexType name="MKDIR"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> +</xs:schema> +</pre> +</p> +<a name="Sqoop_Action_Schema_Version_0.2"></a> +</div> +<div class="section"><h5>Sqoop Action Schema Version 0.2</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:sqoop="uri:oozie:sqoop-action:0.2" elementFormDefault="qualified" + targetNamespace="uri:oozie:sqoop-action:0.2"> <xs:element name="sqoop" type="sqoop:ACTION"/> +. + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:element name="job-tracker" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="prepare" type="sqoop:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="configuration" type="sqoop:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:choice> + <xs:element name="command" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="arg" type="xs:string" minOccurs="1" maxOccurs="unbounded"/> + </xs:choice> + <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> +. + <xs:complexType name="CONFIGURATION"> + <xs:sequence> + <xs:element name="property" minOccurs="1" maxOccurs="unbounded"> + <xs:complexType> + <xs:sequence> + <xs:element name="name" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="value" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="description" minOccurs="0" maxOccurs="1" type="xs:string"/> + </xs:sequence> + </xs:complexType> + </xs:element> + </xs:sequence> + </xs:complexType> +. 
+ <xs:complexType name="PREPARE"> + <xs:sequence> + <xs:element name="delete" type="sqoop:DELETE" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="mkdir" type="sqoop:MKDIR" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> +. + <xs:complexType name="DELETE"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> +. + <xs:complexType name="MKDIR"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> +. +</xs:schema> +</pre> +</p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<p></p> +</div> +</div> +</div> + + </div> + </div> + </div> + + <hr/> + + <footer> + <div class="container-fluid"> + <div class="row-fluid"> + <p >Copyright © 2018 + <a href="http://www.apache.org">Apache Software Foundation</a>. + All rights reserved. + + </p> + </div> + + + </div> + </footer> + </body> +</html> Added: oozie/site/trunk/content/resources/docs/5.0.0/DG_SshActionExtension.html URL: http://svn.apache.org/viewvc/oozie/site/trunk/content/resources/docs/5.0.0/DG_SshActionExtension.html?rev=1828722&view=auto ============================================================================== --- oozie/site/trunk/content/resources/docs/5.0.0/DG_SshActionExtension.html (added) +++ oozie/site/trunk/content/resources/docs/5.0.0/DG_SshActionExtension.html Mon Apr 9 14:12:36 2018 @@ -0,0 +1,295 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at Apr 9, 2018 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Oozie - </title> + <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="./css/site.css" /> + <link rel="stylesheet" href="./css/print.css" media="print" /> + + + <script 
type="text/javascript" src="./js/apache-maven-fluido-1.4.min.js"></script> + + + </head> + <body class="topBarDisabled"> + + + + <div class="container-fluid"> + <div id="banner"> + <div class="pull-left"> + <a href="https://oozie.apache.org/" id="bannerLeft"> + <img src="https://oozie.apache.org/images/oozie_200x.png" alt="Oozie"/> + </a> + </div> + <div class="pull-right"> </div> + <div class="clear"><hr/></div> + </div> + + <div id="breadcrumbs"> + <ul class="breadcrumb"> + + + <li class=""> + <a href="../../" title="Apache"> + Apache</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../../" title="Oozie"> + Oozie</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../" title="docs"> + docs</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="./" title="5.0.0"> + 5.0.0</a> + <span class="divider">/</span> + </li> + <li class="active ">Oozie - </li> + + + + <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-04-09</li> + <li id="projectVersion" class="pull-right"> + Version: 5.0.0 + </li> + + </ul> + </div> + + + <div class="row-fluid"> + <div id="leftColumn" class="span2"> + <div class="well sidebar-nav"> + + + <ul class="nav nav-list"> + </ul> + + + + <hr /> + + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> + <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /> + </a> + </div> + </div> + </div> + + + <div id="bodyColumn" class="span10" > + + <p></p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<hr /> +<a name="Oozie_Ssh_Action_Extension"></a> +<div class="section"><h2> Oozie Ssh Action Extension</h2> +<p><ul><ul><li><a href="#Ssh_Action">Ssh Action</a> +</li> +<li><a href="#Appendix_Ssh_XML-Schema">Appendix, Ssh XML-Schema</a> 
+<ul><li><a href="#AE.A_Appendix_A_Ssh_XML-Schema">AE.A Appendix A, Ssh XML-Schema</a> +<ul><li><a href="#Ssh_Action_Schema_Version_0.2">Ssh Action Schema Version 0.2</a> +</li> +<li><a href="#Ssh_Action_Schema_Version_0.1">Ssh Action Schema Version 0.1</a> +</li> +</ul> +</li> +</ul> +</li> +</ul> +</ul> +</p> +<a name="Ssh_Action"></a> +<div class="section"><h3>Ssh Action</h3> +<p>The <tt>ssh</tt> + action starts a shell command on a remote machine as a remote secure shell in the background. The workflow job +will wait until the remote shell command completes before continuing to the next action.</p> +<p>The shell command must be present on the remote machine and it must be available for execution via the command path.</p> +<p>The shell command is executed in the home directory of the specified user on the remote host.</p> +<p>The output (STDOUT) of the ssh job can be made available to the workflow job after the ssh job ends. This information +could be used from within decision nodes. If the output of the ssh job is made available to the workflow job, the shell +command must meet the following requirements:</p> +<p><ul><li>The format of the output must be a valid Java Properties file.</li> +<li>The size of the output must not exceed 2KB.</li> +</ul> +</p> +<p>Note: The Ssh action will fail if any output is written to standard error or standard output upon login (e.g. if the .bashrc of the remote +user contains ls -a).</p> +<p><b>Syntax:</b> +</p> +<p><pre> +<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0"> + ... + <action name="[NODE-NAME]"> + <ssh xmlns="uri:oozie:ssh-action:0.1"> + <host>[USER]@[HOST]</host> + <command>[SHELL]</command> + <args>[ARGUMENTS]</args> + ... + <capture-output/> + </ssh> + <ok to="[NODE-NAME]"/> + <error to="[NODE-NAME]"/> + </action> + ... 
+</workflow-app> +</pre></p> +<p>The <tt>host</tt> + element indicates the user and host where the shell will be executed.</p> +<p><b>IMPORTANT:</b> + The <tt>oozie.action.ssh.allow.user.at.host</tt> + property, in the <tt>oozie-site.xml</tt> + configuration, indicates whether +a user other than the one submitting the job can be used for the ssh invocation. By default this property is set +to <tt>true</tt> +.</p> +<p>The <tt>command</tt> + element indicates the shell command to execute.</p> +<p>The <tt>args</tt> + element, if present, contains parameters to be passed to the shell command. If more than one <tt>args</tt> + element +is present, they are concatenated in order. When an <tt>args</tt> + element contains a space, even when quoted, it is split into +separate arguments (e.g. "Hello World" becomes "Hello" and "World"). Starting with ssh schema 0.2, you can use the <tt>arg</tt> + element +(note that this is different from the <tt>args</tt> + element) to specify arguments that have a space in them (e.g. "Hello World" is +preserved as "Hello World"). You can use either <tt>args</tt> + elements, <tt>arg</tt> + elements, or neither; but not both in the same action.</p> +<p>If the <tt>capture-output</tt> + element is present, it tells Oozie to capture the STDOUT of the ssh command +execution. The ssh command output must be in Java Properties file format and it must not exceed 2KB. From within the +workflow definition, the output of an ssh action node is accessible via the <tt>String action:output(String node, +String key)</tt> + function (refer to section '4.2.6 Action EL Functions').</p> +<p>The configuration of the <tt>ssh</tt> + action can be parameterized (templatized) using EL expressions.</p> +<p><b>Example:</b> +</p> +<p><pre> +<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:1.0"> + ... 
+ <action name="myssjob"> + <ssh xmlns="uri:oozie:ssh-action:0.1"> + <host>f...@bar.com</host> + <command>uploaddata</command> + <args>jdbc:derby://bar.com:1527/myDB</args> + <args>hdfs://foobar.com:8020/usr/tucu/myData</args> + </ssh> + <ok to="myotherjob"/> + <error to="errorcleanup"/> + </action> + ... +</workflow-app> +</pre></p> +<p>In the above example, the <tt>uploaddata</tt> + shell command is executed with two arguments, <tt>jdbc:derby://bar.com:1527/myDB</tt> + +and <tt>hdfs://foobar.com:8020/usr/tucu/myData +</tt> +.</p> +<p>The <tt>uploaddata</tt> + shell command must be available on the remote host, in the command path.</p> +<p>The output of the command will be ignored because the <tt>capture-output</tt> + element is not present.</p> +<a name="Appendix_Ssh_XML-Schema"></a> +</div> +<div class="section"><h3>Appendix, Ssh XML-Schema</h3> +<a name="AE.A_Appendix_A_Ssh_XML-Schema"></a> +<div class="section"><h4>AE.A Appendix A, Ssh XML-Schema</h4> +<a name="Ssh_Action_Schema_Version_0.2"></a> +<div class="section"><h5>Ssh Action Schema Version 0.2</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:ssh="uri:oozie:ssh-action:0.2" elementFormDefault="qualified" + targetNamespace="uri:oozie:ssh-action:0.2"> +. + <xs:element name="ssh" type="ssh:ACTION"/> +. + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:element name="host" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="command" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:choice> + <xs:element name="args" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="arg" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + </xs:choice> + <xs:element name="capture-output" type="ssh:FLAG" minOccurs="0" maxOccurs="1"/> + </xs:sequence> + </xs:complexType> +. + <xs:complexType name="FLAG"/> +. 
+</xs:schema> +</pre></p> +<a name="Ssh_Action_Schema_Version_0.1"></a> +</div> +<div class="section"><h5>Ssh Action Schema Version 0.1</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:ssh="uri:oozie:ssh-action:0.1" elementFormDefault="qualified" + targetNamespace="uri:oozie:ssh-action:0.1"> +. + <xs:element name="ssh" type="ssh:ACTION"/> +. + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:element name="host" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="command" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="args" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="capture-output" type="ssh:FLAG" minOccurs="0" maxOccurs="1"/> + </xs:sequence> + </xs:complexType> +. + <xs:complexType name="FLAG"/> +. +</xs:schema> +</pre></p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<p></p> +</div> +</div> +</div> + + </div> + </div> + </div> + + <hr/> + + <footer> + <div class="container-fluid"> + <div class="row-fluid"> + <p >Copyright © 2018 + <a href="http://www.apache.org">Apache Software Foundation</a>. + All rights reserved. 
+ + </p> + </div> + + + </div> + </footer> + </body> +</html> Added: oozie/site/trunk/content/resources/docs/5.0.0/DG_WorkflowReRun.html URL: http://svn.apache.org/viewvc/oozie/site/trunk/content/resources/docs/5.0.0/DG_WorkflowReRun.html?rev=1828722&view=auto ============================================================================== --- oozie/site/trunk/content/resources/docs/5.0.0/DG_WorkflowReRun.html (added) +++ oozie/site/trunk/content/resources/docs/5.0.0/DG_WorkflowReRun.html Mon Apr 9 14:12:36 2018 @@ -0,0 +1,178 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at Apr 9, 2018 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Oozie - </title> + <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="./css/site.css" /> + <link rel="stylesheet" href="./css/print.css" media="print" /> + + + <script type="text/javascript" src="./js/apache-maven-fluido-1.4.min.js"></script> + + + </head> + <body class="topBarDisabled"> + + + + <div class="container-fluid"> + <div id="banner"> + <div class="pull-left"> + <a href="https://oozie.apache.org/" id="bannerLeft"> + <img src="https://oozie.apache.org/images/oozie_200x.png" alt="Oozie"/> + </a> + </div> + <div class="pull-right"> </div> + <div class="clear"><hr/></div> + </div> + + <div id="breadcrumbs"> + <ul class="breadcrumb"> + + + <li class=""> + <a href="../../" title="Apache"> + Apache</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../../" title="Oozie"> + Oozie</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../" title="docs"> + docs</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="./" title="5.0.0"> + 5.0.0</a> + <span 
class="divider">/</span> + </li> + <li class="active ">Oozie - </li> + + + + <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-04-09</li> + <li id="projectVersion" class="pull-right"> + Version: 5.0.0 + </li> + + </ul> + </div> + + + <div class="row-fluid"> + <div id="leftColumn" class="span2"> + <div class="well sidebar-nav"> + + + <ul class="nav nav-list"> + </ul> + + + + <hr /> + + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> + <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /> + </a> + </div> + </div> + </div> + + + <div id="bodyColumn" class="span10" > + + <p></p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<a name="Workflow_ReRrun"></a> +<div class="section"><h2> Workflow ReRun</h2> +<p><ul><ul><li><a href="#Configs">Configs</a> +</li> +<li><a href="#Pre-Conditions">Pre-Conditions</a> +</li> +<li><a href="#ReRun">ReRun</a> +</li> +</ul> +</ul> +</p> +<a name="Configs"></a> +<div class="section"><h3>Configs</h3> +<p><ul><li>oozie.wf.application.path</li> +</ul> +* Exactly one of the following two configurations must be defined; they must not both be defined at the same time<ul><li><ul><li>oozie.wf.rerun.skip.nodes</li> +<li>oozie.wf.rerun.failnodes</li> +</ul> +</li> +<li>Skip nodes are a comma-separated list of action names. 
They can be any action nodes, including decision nodes.</li> +<li>The valid values of <tt>oozie.wf.rerun.failnodes</tt> + are true and false.</li> +<li>If a secured Hadoop version is used, the following two properties need to be specified as well<ul><li>mapreduce.jobtracker.kerberos.principal</li> +<li>dfs.namenode.kerberos.principal</li> +</ul> +</li> +<li>Configurations can be passed as -D parameters.</li> +</ul> +<pre> +$ oozie job -oozie http://localhost:11000/oozie -rerun 14-20090525161321-oozie-joe -Doozie.wf.rerun.skip.nodes=<> +</pre></p> +<a name="Pre-Conditions"></a> +</div> +<div class="section"><h3>Pre-Conditions</h3> +<p><ul><li>Workflow with id wfId should exist.</li> +<li>Workflow with id wfId should be in the SUCCEEDED, KILLED or FAILED state.</li> +<li>If specified, the nodes listed in the config oozie.wf.rerun.skip.nodes must have completed successfully.</li> +</ul> +</p> +<a name="ReRun"></a> +</div> +<div class="section"><h3>ReRun</h3> +<p><ul><li>Reloads the configs.</li> +<li>If no configuration is passed, the existing coordinator/workflow configuration will be used. If a configuration is passed, it will be merged with the existing workflow configuration. The input configuration takes precedence.</li> +<li>Currently there is no way to remove an existing configuration; it can only be overridden by passing a different value in the input configuration.</li> +<li>Creates a new Workflow Instance with the same wfId.</li> +<li>Deletes the actions that are not skipped from the DB and copies data from the old Workflow Instance to the new one for skipped actions.</li> +<li>The action handler will skip the nodes given in the config with the same exit transition as before.</li> +</ul> +</p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<p></p> +</div> + + </div> + </div> + </div> + + <hr/> + + <footer> + <div class="container-fluid"> + <div class="row-fluid"> + <p >Copyright © 2018 + <a href="http://www.apache.org">Apache Software Foundation</a>. + All rights reserved. 
+ + </p> + </div> + + + </div> + </footer> + </body> +</html> Added: oozie/site/trunk/content/resources/docs/5.0.0/ENG_Building.html URL: http://svn.apache.org/viewvc/oozie/site/trunk/content/resources/docs/5.0.0/ENG_Building.html?rev=1828722&view=auto ============================================================================== --- oozie/site/trunk/content/resources/docs/5.0.0/ENG_Building.html (added) +++ oozie/site/trunk/content/resources/docs/5.0.0/ENG_Building.html Mon Apr 9 14:12:36 2018 @@ -0,0 +1,427 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at Apr 9, 2018 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Oozie - </title> + <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="./css/site.css" /> + <link rel="stylesheet" href="./css/print.css" media="print" /> + + + <script type="text/javascript" src="./js/apache-maven-fluido-1.4.min.js"></script> + + + </head> + <body class="topBarDisabled"> + + + + <div class="container-fluid"> + <div id="banner"> + <div class="pull-left"> + <a href="https://oozie.apache.org/" id="bannerLeft"> + <img src="https://oozie.apache.org/images/oozie_200x.png" alt="Oozie"/> + </a> + </div> + <div class="pull-right"> </div> + <div class="clear"><hr/></div> + </div> + + <div id="breadcrumbs"> + <ul class="breadcrumb"> + + + <li class=""> + <a href="../../" title="Apache"> + Apache</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../../" title="Oozie"> + Oozie</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../" title="docs"> + docs</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="./" title="5.0.0"> + 5.0.0</a> + <span class="divider">/</span> + </li> + 
<li class="active ">Oozie - </li> + + + + <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-04-09</li> + <li id="projectVersion" class="pull-right"> + Version: 5.0.0 + </li> + + </ul> + </div> + + + <div class="row-fluid"> + <div id="leftColumn" class="span2"> + <div class="well sidebar-nav"> + + + <ul class="nav nav-list"> + </ul> + + + + <hr /> + + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> + <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /> + </a> + </div> + </div> + </div> + + + <div id="bodyColumn" class="span10" > + + <p></p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<a name="Building_Oozie"></a> +<div class="section"><h2> Building Oozie</h2> +<p><ul><ul><li><a href="#System_Requirements">System Requirements</a> +</li> +<li><a href="#Oozie_Documentation_Generation">Oozie Documentation Generation</a> +</li> +<li><a href="#Passphrase-less_SSH_Setup">Passphrase-less SSH Setup</a> +</li> +<li><a href="#Building_with_different_Java_Versions">Building with different Java Versions</a> +</li> +<li><a href="#Building_and_Testing_Oozie">Building and Testing Oozie</a> +<ul><li><a href="#Examples_Running_Oozie_Testcases_with_Different_Configurations">Examples Running Oozie Testcases with Different Configurations</a> +</li> +<li><a href="#Build_Options_Reference">Build Options Reference</a> +</li> +<li><a href="#Testing_Map_Reduce_Pipes_Action">Testing Map Reduce Pipes Action</a> +</li> +</ul> +</li> +<li><a href="#Building_an_Oozie_Distribution">Building an Oozie Distribution</a> +</li> +<li><a href="#IDE_Setup">IDE Setup</a> +</li> +</ul> +</ul> +</p> +<a name="System_Requirements"></a> +<div class="section"><h3>System Requirements</h3> +<p><ul><li>Unix box (tested on Mac OS X and Linux)</li> 
+<li>Java JDK 1.8+</li> +<li><a class="externalLink" href="http://maven.apache.org/">Maven 3.0.1+</a> +</li> +<li><a class="externalLink" href="http://hadoop.apache.org/core/releases.html">Hadoop 2.6.0+</a> +</li> +<li><a class="externalLink" href="http://hadoop.apache.org/pig/releases.html">Pig 0.10.1+</a> +</li> +</ul> +</p> +<p>JDK commands (java, javac) must be in the command path.</p> +<p>The Maven command (mvn) must be in the command path.</p> +<a name="Oozie_Documentation_Generation"></a> +</div> +<div class="section"><h3>Oozie Documentation Generation</h3> +<p>To generate the documentation, Oozie uses a patched Doxia plugin for Maven with improved twiki support.</p> +<p>The source of the modified plugin is available in the Oozie GitHub repository, in the <tt>ydoxia</tt> + branch.</p> +<p>To build and install it locally run the following command in the <tt>ydoxia</tt> + branch:</p> +<p><pre> +$ mvn install +</pre></p> +<p><a name="SshSetup"></a> +</p> +<a name="Passphrase-less_SSH_Setup"></a> +</div> +<div class="section"><h3>Passphrase-less SSH Setup</h3> +<p><b>NOTE: SSH actions are deprecated in Oozie 2.</b> +</p> +<p>To run SSH Testcases and for easier Hadoop start/stop configure SSH to localhost to be passphrase-less.</p> +<p>Create your SSH keys without a passphrase and add the public key to the authorized file:</p> +<p><pre> +$ ssh-keygen -t dsa +$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys2 +</pre></p> +<p>Test that you can ssh without password:</p> +<p><pre> +$ ssh localhost +</pre></p> +<a name="Building_with_different_Java_Versions"></a> +</div> +<div class="section"><h3>Building with different Java Versions</h3> +<p>Oozie requires a minimum Java version of 1.8. Any newer version can be used but by default bytecode will be generated +which is compatible with 1.8. 
This can be changed by specifying the build property <b>targetJavaVersion</b> +.</p> +<a name="Building_and_Testing_Oozie"></a> +</div> +<div class="section"><h3>Building and Testing Oozie</h3> +<p>The JARs for the specified Hadoop and Pig versions must be available in one of the Maven repositories defined in Oozie's +main 'pom.xml' file, or they must be installed in the local Maven cache.</p> +<a name="Examples_Running_Oozie_Testcases_with_Different_Configurations"></a> +<div class="section"><h4>Examples Running Oozie Testcases with Different Configurations</h4> +<p><b>Using embedded Hadoop minicluster with 'simple' authentication:</b> +</p> +<p><pre> +$ mvn clean test +</pre></p> +<p><b>Using a Hadoop cluster with 'simple' authentication:</b> +</p> +<p><pre> +$ mvn clean test -Doozie.test.hadoop.minicluster=false +</pre></p> +<p><b>Using embedded Hadoop minicluster with 'simple' authentication and Derby database:</b> +</p> +<p><pre> +$ mvn clean test -Doozie.test.hadoop.minicluster=false -Doozie.test.db=derby +</pre></p> +<p><b>Using a Hadoop cluster with 'kerberos' authentication:</b> +</p> +<p><pre> +$ mvn clean test -Doozie.test.hadoop.minicluster=false -Doozie.test.hadoop.security=kerberos +</pre></p> +<p>NOTE: The embedded minicluster cannot be used when testing with 'kerberos' authentication.</p> +<p><b>Using a custom Oozie configuration for testcases:</b> +</p> +<p><pre> +$ mvn clean test -Doozie.test.config.file=/home/tucu/custom-oozie-site.xml +</pre></p> +<p><b>Running the testcases with different databases:</b> +</p> +<p><pre> +$ mvn clean test -Doozie.test.db=[hsqldb*|derby|mysql|postgres|oracle] +</pre></p> +<p>Using <tt>mysql</tt> + and <tt>oracle</tt> + enables profiles that will include their JAR files in the build. 
If using + <tt>oracle</tt> +, the Oracle JDBC JAR file must be manually installed in the local Maven cache (the JAR is +not available in public Maven repos).</p> +<a name="Build_Options_Reference"></a> +</div> +<div class="section"><h4>Build Options Reference</h4> +<p>All these options can be set using <b>-D</b> +.</p> +<p>Except for the options marked with <tt>(*)</tt> +, the options can be specified in the <tt>test.properties</tt> + file in the root +of the Oozie project. The options marked with <tt>(*)</tt> + are used in Maven POMs, thus they don't take effect if +specified in the <tt>test.properties</tt> + file (which is loaded by the <tt>XTestCase</tt> + class at class initialization time).</p> +<p><b>hadoop.version</b> + <tt>(*)</tt> +: indicates the specific Hadoop version you wish to build Oozie against. It will +substitute this value in the Oozie POM properties and pull the corresponding Hadoop artifacts from Maven. +The default version is 2.6.0 and that is the minimum supported Hadoop version.</p> +<p><b>generateSite</b> + (*): generates Oozie documentation, default is undefined (no documentation is generated)</p> +<p><b>skipTests</b> + (*): skips the execution of all testcases, no value required, default is undefined</p> +<p><b>test</b> + (*): runs a single test case; give the test class name without package and extension; no default</p> +<p><b>oozie.test.db</b> + (*): indicates the database to use for running the testcases, supported values are 'hsqldb', 'derby', + 'mysql', 'postgres' and 'oracle'; default value is 'hsqldb'. For each database there is a preconfigured + <tt>core/src/test/resources/DATABASE-oozie-site.xml</tt> + file.</p> +<p><b>oozie.test.properties</b> + (*): indicates the file to load the test properties from, the default is <tt>test.properties</tt> +. 
+Having this option allows having different test property sets, for example: minicluster, simple and kerberos.</p> +<p><b>oozie.test.waitfor.ratio</b> + : multiplication factor for testcases using waitfor, the ratio is used to adjust the +effective timeout. For slow machines the ratio should be increased. The default value is <tt>1</tt> +.</p> +<p><b>oozie.test.config.file</b> + : indicates a custom Oozie configuration file for running the testcases. The specified file +must be an absolute path. For example, it can be useful to specify a different database than HSQL for running the +testcases.</p> +<p><b>oozie.test.hadoop.minicluster</b> + : indicates if Hadoop minicluster should be started for testcases, default value 'true'</p> +<p><b>oozie.test.job.tracker</b> + : indicates the URI of the JobTracker when using a Hadoop cluster for testing, default value +'localhost:8021'</p> +<p><b>oozie.test.name.node</b> + : indicates the URI of the NameNode when using a Hadoop cluster for testing, default value +'hdfs://localhost:8020'</p> +<p><b>oozie.test.hadoop.security</b> + : indicates the type of Hadoop authentication for testing, valid values are 'simple' or +'kerberos', default value 'simple'</p> +<p><b>oozie.test.kerberos.keytab.file</b> + : indicates the location of the keytab file, default value +'${user.home}/oozie.keytab'</p> +<p><b>oozie.test.kerberos.realm</b> + : indicates the Kerberos realm, default value 'LOCALHOST'</p> +<p><b>oozie.test.kerberos.oozie.principal</b> + : indicates the Kerberos principal for oozie, default value +'${user.name}/localhost'</p> +<p><b>oozie.test.kerberos.jobtracker.principal</b> + : indicates the Kerberos principal for the JobTracker, default value +'mapred/localhost'</p> +<p><b>oozie.test.kerberos.namenode.principal</b> + : indicates the Kerberos principal for the NameNode, default value +'hdfs/localhost'</p> +<p><b>oozie.test.user.oozie</b> +<tt> : specifies the user ID used to start Oozie server in testcases, default 
value +is ${user.name}</tt> +.</p> +<p><b>oozie.test.user.test</b> + : specifies the primary user ID used as the user submitting jobs to Oozie Server in testcases, +default value is <tt>test</tt> +.</p> +<p><b>oozie.test.user.test2</b> + : specifies a secondary user ID used as the user submitting jobs to Oozie Server in testcases, +default value is <tt>test2</tt> +.</p> +<p><b>oozie.test.user.test3</b> + : specifies a secondary user ID used as the user submitting jobs to Oozie Server in testcases, +default value is <tt>test3</tt> +.</p> +<p><b>oozie.test.group</b> + : specifies the group ID used as group when submitting jobs to Oozie Server in testcases, +default value is <tt>testg</tt> +.</p> +<p>NOTE: The users/group specified in <b>oozie.test.user.test2</b> +, <b>oozie.test.user.test3</b> + and <b>oozie.test.group</b> + +are used for the authorization testcases only.</p> +<p><b>oozie.test.dir</b> + : specifies the directory where the <tt>oozietests</tt> + directory will be created, default value is <tt>/tmp</tt> +. +The <tt>oozietests</tt> + directory is used by testcases when they need a local filesystem directory.</p> +<p><b>hadoop.log.dir</b> + : specifies the directory where Hadoop minicluster will write its logs during testcases, default +value is <tt>/tmp</tt> +.</p> +<p><b>test.exclude</b> + : specifies a testcase class (just the class name) to exclude from the test run, for example <tt>TestSubmitCommand</tt> +.</p> +<p><b>test.exclude.pattern</b> + : specifies one or more patterns for testcases to exclude, for example <tt>**/Test*Command.java</tt> +.</p> +<a name="Testing_Map_Reduce_Pipes_Action"></a> +</div> +<div class="section"><h4>Testing Map Reduce Pipes Action</h4> +<p>Pipes testcases require Hadoop's <b>wordcount-simple</b> + pipes binary example to run. The <b>wordcount-simple</b> + pipes binary +should be compiled for the build platform and copied into Oozie's <b>core/src/test/resources/</b> + directory. 
The binary file +must be named <b>wordcount-simple</b>.</p> +<p>If the <b>wordcount-simple</b> pipes binary file is not available, the testcase does nothing (NOP) and prints the following message to its output +file: 'SKIPPING TEST: TestPipesMain, binary 'wordcount-simple' not available in the classpath'.</p> +<p>Two testcases use the <b>wordcount-simple</b> pipes binary: <b>TestPipesMain</b> and <b>TestMapReduceActionExecutor</b>. +The 'SKIPPING TEST...' message appears in the log file of both testcases.</p> +<a name="Building_an_Oozie_Distribution"></a> +</div> +</div> +<div class="section"><h3>Building an Oozie Distribution</h3> +<p>An Oozie distribution bundles an embedded Jetty server.</p> +<p>The simplest way to build Oozie is to run the <tt>mkdistro.sh</tt> + script: +<pre> +$ bin/mkdistro.sh [-DskipTests] +Running <tt>mkdistro.sh</tt> + will create the binary distribution of Oozie. The following options are available to customise +the versions of the dependencies: +-Puber - Bundle required hadoop and hcatalog libraries in oozie war +-Dhadoop.version=<version> - default 2.6.0 +-Ptez - Bundle tez jars in hive and pig sharelibs. Useful if you want to use tez +as the execution engine for those applications.
+-Dpig.version=<version> - default 0.16.0 +-Dpig.classifier=<classifier> - default h2 +-Dsqoop.version=<version> - default 1.4.3 +-Dsqoop.classifier=<classifier> - default hadoop100 +-Djetty.version=<version> - default 9.2.19.v20160908 +-Dopenjpa.version=<version> - default 2.2.2 +-Dxerces.version=<version> - default 2.10.0 +-Dcurator.version=<version> - default 2.5.0 +-Dhive.version=<version> - default 1.2.0 +-Dhbase.version=<version> - default 1.2.3 +-Dtez.version=<version> - default 0.8.4 +</pre></p> +<p>The following properties should be specified when building a release:</p> +<p><ul><li>-DgenerateDocs : forces the generation of Oozie documentation</li> +<li>-Dbuild.time= : timestamps the distribution</li> +<li>-Dvc.revision= : specifies the source control revision number of the distribution</li> +<li>-Dvc.url= : specifies the source control URL of the distribution</li> +</ul> +</p> +<p>The provided <tt>bin/mkdistro.sh</tt> + script runs the above Maven invocation, setting all these properties to the +right values (the 'vc.*' properties are obtained from the local git repository).</p> +<a name="IDE_Setup"></a> +</div> +<div class="section"><h3>IDE Setup</h3> +<p>Eclipse and IntelliJ can use the Oozie Maven project files directly.</p> +<p>The only special consideration is that the following source directories from the <tt>client</tt> + module must be added to +the <tt>core</tt> + module source path:</p> +<p><ul><li><tt>client/src/main/java</tt> + : as source directory</li> +<li><tt>client/src/main/resources</tt> + : as source directory</li> +<li><tt>client/src/test/java</tt> + : as test-source directory</li> +<li><tt>client/src/test/resources</tt> + : as test-source directory</li> +</ul> +</p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<p></p> +</div> + + </div> + </div> + </div> + + <hr/> + + <footer> + <div class="container-fluid"> + <div class="row-fluid"> + <p >Copyright © 2018 + <a href="http://www.apache.org">Apache Software
Foundation</a>. + All rights reserved. + + </p> + </div> + + + </div> + </footer> + </body> +</html>