Added: websites/staging/oozie/trunk/content/docs/5.0.0/DG_QuickStart.html ============================================================================== --- websites/staging/oozie/trunk/content/docs/5.0.0/DG_QuickStart.html (added) +++ websites/staging/oozie/trunk/content/docs/5.0.0/DG_QuickStart.html Mon Apr 9 14:26:49 2018 @@ -0,0 +1,390 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at Apr 9, 2018 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Oozie - </title> + <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="./css/site.css" /> + <link rel="stylesheet" href="./css/print.css" media="print" /> + + + <script type="text/javascript" src="./js/apache-maven-fluido-1.4.min.js"></script> + + + </head> + <body class="topBarDisabled"> + + + + <div class="container-fluid"> + <div id="banner"> + <div class="pull-left"> + <a href="https://oozie.apache.org/" id="bannerLeft"> + <img src="https://oozie.apache.org/images/oozie_200x.png" alt="Oozie"/> + </a> + </div> + <div class="pull-right"> </div> + <div class="clear"><hr/></div> + </div> + + <div id="breadcrumbs"> + <ul class="breadcrumb"> + + + <li class=""> + <a href="../../" title="Apache"> + Apache</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../../" title="Oozie"> + Oozie</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../" title="docs"> + docs</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="./" title="5.0.0"> + 5.0.0</a> + <span class="divider">/</span> + </li> + <li class="active ">Oozie - </li> + + + + <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-04-09</li> + <li id="projectVersion" 
class="pull-right"> + Version: 5.0.0 + </li> + + </ul> + </div> + + + <div class="row-fluid"> + <div id="leftColumn" class="span2"> + <div class="well sidebar-nav"> + + + <ul class="nav nav-list"> + </ul> + + + + <hr /> + + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> + <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /> + </a> + </div> + </div> + </div> + + + <div id="bodyColumn" class="span10" > + + <p></p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<a name="Oozie_Quick_Start"></a> +<div class="section"><h2> Oozie Quick Start</h2> +<p>These instructions install and run Oozie using an embedded Jetty server and an embedded Derby database.</p> +<p>For detailed install and configuration instructions refer to <a href="./AG_Install.html">Oozie Install</a> +.</p> +<p><ul><ul><li><a href="#Building_Oozie">Building Oozie</a> +<ul><li><a href="#System_Requirements:">System Requirements:</a> +</li> +<li><a href="#Building_Oozie_">Building Oozie</a> +</li> +</ul> +</li> +<li><a href="#Server_Installation">Server Installation</a> +<ul><li><a href="#System_Requirements">System Requirements</a> +</li> +<li><a href="#Server_Installation_">Server Installation</a> +</li> +</ul> +</li> +<li><a href="#Client_Installation">Client Installation</a> +<ul><li><a href="#System_Requirements_">System Requirements</a> +</li> +<li><a href="#Client_Installation_">Client Installation</a> +</li> +</ul> +</li> +<li><a href="#Oozie_Share_Lib_Installation">Oozie Share Lib Installation</a> +</li> +</ul> +</ul> +</p> +<a name="Building_Oozie"></a> +<div class="section"><h3>Building Oozie</h3> +<a name="System_Requirements:"></a> +<div class="section"><h4>System Requirements:</h4> +<p><ul><li>Unix box (tested on Mac OS X and Linux)</li> +<li>Java JDK 1.8+</li> +<li>Maven 
3.0.1+</li> +<li>Hadoop 2.6.0+</li> +<li>Pig 0.10.1+</li> +</ul> +</p> +<p>JDK commands (java, javac) must be in the command path.</p> +<p>The Maven command (mvn) must be in the command path.</p> +<a name="Building_Oozie_"></a> +</div> +<div class="section"><h4>Building Oozie</h4> +<p>Download a source distribution of Oozie from the "Releases" drop-down menu on the <a class="externalLink" href="http://oozie.apache.org">Oozie site</a> +.</p> +<p>Expand the source distribution <tt>tar.gz</tt> + and change directories into it.</p> +<p>The simplest way to build Oozie is to run the <tt>mkdistro.sh</tt> + script: +<pre> +$ bin/mkdistro.sh [-DskipTests] +Running mkdistro.sh will create the binary distribution of Oozie. By default, the Oozie WAR will not contain the Hadoop and +HCatalog libraries; however, they are required for Oozie to work. There are 2 options to add these libraries: +1. At install time, copy the Hadoop and HCatalog libraries to libext and run oozie-setup.sh to set up Oozie. This is +suitable when the same Oozie package needs to be used in multiple setups with different Hadoop/HCatalog versions. +2. Build with -Puber, which will bundle the required libraries in the Oozie WAR. Further, the following options are +available to customise the versions of the dependencies: +-Dhadoop.version=<version> - default 2.6.0 +-Ptez - Bundle tez jars in hive and pig sharelibs. Useful if you want to use tez +as the execution engine for those applications. 
+-Dpig.version=<version> - default 0.16.0 +-Dpig.classifier=<classifier> - default h2 +-Dsqoop.version=<version> - default 1.4.3 +-Dsqoop.classifier=<classifier> - default hadoop100 +-Djetty.version=<version> - default 9.2.19.v20160908 +-Dopenjpa.version=<version> - default 2.2.2 +-Dxerces.version=<version> - default 2.10.0 +-Dcurator.version=<version> - default 2.5.0 +-Dhive.version=<version> - default 1.2.0 +-Dhbase.version=<version> - default 1.2.3 +-Dtez.version=<version> - default 0.8.4 +</pre> +</p> +<p>More details on building Oozie can be found on the <a href="./ENG_Building.html">Building Oozie</a> + page.</p> +<a name="Server_Installation"></a> +</div> +</div> +<div class="section"><h3>Server Installation</h3> +<a name="System_Requirements"></a> +<div class="section"><h4>System Requirements</h4> +<p><ul><li>Unix (tested in Linux and Mac OS X)</li> +<li>Java 1.8+</li> +<li>Hadoop<ul><li><a class="externalLink" href="http://hadoop.apache.org">Apache Hadoop</a> + (tested with 1.2.1 & 2.6.0+)</li> +</ul> +</li> +<li>ExtJS library (optional, to enable the Oozie web console)<ul><li><a class="externalLink" href="http://archive.cloudera.com/gplextras/misc/ext-2.2.zip">ExtJS 2.2</a> +</li> +</ul> +</li> +</ul> +</p> +<p>The Java 1.8+ <tt>bin</tt> + directory should be in the command path.</p> +<a name="Server_Installation_"></a> +</div> +<div class="section"><h4>Server Installation</h4> +<p><b>IMPORTANT:</b> + Oozie ignores any set value for <tt>OOZIE_HOME</tt> +; Oozie computes its home automatically.</p> +<p><ul><li>Build an Oozie binary distribution</li> +<li>Download a Hadoop binary distribution</li> +<li>Download the ExtJS library (it must be version 2.2)</li> +</ul> +</p> +<p><b>NOTE:</b> + The ExtJS library is not bundled with Oozie because it uses a different license.</p> +<p><b>NOTE:</b> + Oozie UI browser compatibility: Chrome (all), Firefox (3.5), Internet Explorer (8.0), Opera (10.5).</p> +<p><b>NOTE:</b> + It is recommended to use an Oozie Unix user for the Oozie 
server.</p> +<p>Expand the Oozie distribution <tt>tar.gz</tt> +.</p> +<p>Expand the Hadoop distribution <tt>tar.gz</tt> + (as the Oozie Unix user).</p> +<p><a name="HadoopProxyUser"></a> +</p> +<p><b>NOTE:</b> + Configure the Hadoop cluster with proxyuser for the Oozie process.</p> +<p>The following two properties are required in Hadoop core-site.xml:</p> +<p><pre> + <!-- OOZIE --> + <property> + <name>hadoop.proxyuser.[OOZIE_SERVER_USER].hosts</name> + <value>[OOZIE_SERVER_HOSTNAME]</value> + </property> + <property> + <name>hadoop.proxyuser.[OOZIE_SERVER_USER].groups</name> + <value>[USER_GROUPS_THAT_ALLOW_IMPERSONATION]</value> + </property> +</pre></p> +<p>Replace the bracketed capital-letter placeholders with specific values and then restart Hadoop.</p> +<p>The ExtJS library is optional (only required for the Oozie web console to work).</p> +<p><b>IMPORTANT:</b> + All Oozie server scripts (<tt>oozie-setup.sh</tt> +, <tt>oozied.sh</tt> +, <tt>oozie-start.sh</tt> +, <tt>oozie-run.sh</tt> + +and <tt>oozie-stop.sh</tt> +) run only under the Unix user that owns the Oozie installation directory; +if necessary, use <tt>sudo -u OOZIE_USER</tt> + when invoking the scripts.</p> +<p>As of Oozie 3.3.2, use of <tt>oozie-start.sh</tt> +, <tt>oozie-run.sh</tt> +, and <tt>oozie-stop.sh</tt> + has +been deprecated and will print a warning. The <tt>oozied.sh</tt> + script should be used +instead; passing it <tt>start</tt> +, <tt>run</tt> +, or <tt>stop</tt> + as an argument will perform the +behaviors of <tt>oozie-start.sh</tt> +, <tt>oozie-run.sh</tt> +, and <tt>oozie-stop.sh</tt> + respectively.</p> +<p>Create a <b>libext/</b> + directory in the directory where Oozie was expanded.</p> +<p>If using the ExtJS library, copy the ZIP file to the <b>libext/</b> + directory. 
If the Hadoop and HCatalog libraries are not +already included in the WAR, add the corresponding libraries to the <b>libext/</b> + directory.</p> +<p>A "sharelib create -fs fs_default_name [-locallib sharelib]" command is available when running oozie-setup.sh +for uploading a new sharelib into HDFS, where the first argument is the default fs name +and the second argument is the Oozie sharelib to install; it can be a tarball or the expanded version of it. +If the second argument is omitted, the Oozie sharelib tarball from the Oozie installation directory will be used. +The upgrade command is deprecated; use the create command to create a new version of the sharelib. +Sharelib files are copied to a new lib_<timestamped> directory. At startup, the server picks the sharelib from the directory with the latest timestamp. +While starting, the server also purges sharelib directories that are older than the sharelib retention period +(defined by oozie.service.ShareLibService.temp.sharelib.retention.days; the default is 7 days).</p> +<p>The "db create|upgrade|postupgrade -run [-sqlfile <FILE>]" command creates, upgrades or post-upgrades the Oozie DB, with an +optional SQL file.</p> +<p>Run the <tt>oozie-setup.sh</tt> + script to configure Oozie with all the components added to the <b>libext/</b> + directory.</p> +<p><pre> +$ bin/oozie-setup.sh sharelib create -fs <FS_URI> [-locallib <PATH>] + sharelib upgrade -fs <FS_URI> [-locallib <PATH>] + db create|upgrade|postupgrade -run [-sqlfile <FILE>] +</pre></p> +<p><b>IMPORTANT</b> +: If the Oozie server needs to establish a secure connection with an external server that uses a self-signed certificate, +make sure you specify the location of a truststore that contains the required certificates. This can be done by configuring +<tt>oozie.https.truststore.file</tt> + in <tt>oozie-site.xml</tt> +, or by setting the <tt>javax.net.ssl.trustStore</tt> + system property. 
+If it is set in both places, the value passed as a system property will be used.</p> +<p>Create the Oozie DB using the 'ooziedb.sh' command line tool:</p> +<p><pre> +$ bin/ooziedb.sh create -sqlfile oozie.sql -run +Validate DB Connection. +DONE +Check DB schema does not exist +DONE +Check OOZIE_SYS table does not exist +DONE +Create SQL schema +DONE +DONE +Create OOZIE_SYS table +DONE +Oozie DB has been created for Oozie version '3.2.0' +$ +</pre> +</p> +<p>To start Oozie as a daemon process, run:</p> +<p><pre> +$ bin/oozied.sh start +</pre></p> +<p>To start Oozie as a foreground process, run:</p> +<p><pre> +$ bin/oozied.sh run +</pre></p> +<p>Check the Oozie log file <tt>logs/oozie.log</tt> + to ensure Oozie started properly.</p> +<p>Using the Oozie command line tool, check the status of Oozie:</p> +<p><pre> +$ bin/oozie admin -oozie http://localhost:11000/oozie -status +</pre></p> +<p>Using a browser, go to the <a class="externalLink" href="http://localhost:11000/oozie">Oozie web console</a> +; the Oozie status should be <b>NORMAL</b> +.</p> +<p>Refer to the <a href="./DG_Examples.html">Running the Examples</a> + document for details on running the examples.</p> +<a name="Client_Installation"></a> +</div> +</div> +<div class="section"><h3>Client Installation</h3> +<a name="System_Requirements_"></a> +<div class="section"><h4>System Requirements</h4> +<p><ul><li>Unix (tested in Linux and Mac OS X)</li> +<li>Java 1.8+</li> +</ul> +</p> +<p>The Java 1.8+ <tt>bin</tt> + directory should be in the command path.</p> +<a name="Client_Installation_"></a> +</div> +<div class="section"><h4>Client Installation</h4> +<p>Copy and expand the <tt>oozie-client</tt> + TAR.GZ file bundled with the distribution. 
Add the <tt>bin/</tt> + directory to the <tt>PATH</tt> +.</p> +<p>Refer to the <a href="./DG_CommandLineTool.html">Command Line Interface Utilities</a> + document for a full reference of the <tt>oozie</tt> + +command line tool.</p> +<p>NOTE: The Oozie server installation includes the Oozie client. The Oozie client should be installed on remote machines +only.</p> +<p><a name="OozieShareLib"></a> +</p> +<a name="Oozie_Share_Lib_Installation"></a> +</div> +</div> +<div class="section"><h3>Oozie Share Lib Installation</h3> +<p>The Oozie share lib is installed by the <tt>oozie-setup.sh sharelib create</tt> + command explained in the earlier section.</p> +<p>See the <a href="./WorkflowFunctionalSpec.html#ShareLib">Workflow Functional Specification</a> + and <a href="./AG_Install.html#Oozie_Share_Lib">Installation</a> + for more information about the Oozie ShareLib.</p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<p></p> +</div> + + </div> + </div> + </div> + + <hr/> + + <footer> + <div class="container-fluid"> + <div class="row-fluid"> + <p >Copyright © 2018 + <a href="http://www.apache.org">Apache Software Foundation</a>. + All rights reserved. + + </p> + </div> + + + </div> + </footer> + </body> +</html>
Added: websites/staging/oozie/trunk/content/docs/5.0.0/DG_SLAMonitoring.html ============================================================================== --- websites/staging/oozie/trunk/content/docs/5.0.0/DG_SLAMonitoring.html (added) +++ websites/staging/oozie/trunk/content/docs/5.0.0/DG_SLAMonitoring.html Mon Apr 9 14:26:49 2018 @@ -0,0 +1,619 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at Apr 9, 2018 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Oozie - </title> + <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="./css/site.css" /> + <link rel="stylesheet" href="./css/print.css" media="print" /> + + + <script type="text/javascript" src="./js/apache-maven-fluido-1.4.min.js"></script> + + + </head> + <body class="topBarDisabled"> + + + + <div class="container-fluid"> + <div id="banner"> + <div class="pull-left"> + <a href="https://oozie.apache.org/" id="bannerLeft"> + <img src="https://oozie.apache.org/images/oozie_200x.png" alt="Oozie"/> + </a> + </div> + <div class="pull-right"> </div> + <div class="clear"><hr/></div> + </div> + + <div id="breadcrumbs"> + <ul class="breadcrumb"> + + + <li class=""> + <a href="../../" title="Apache"> + Apache</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../../" title="Oozie"> + Oozie</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../" title="docs"> + docs</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="./" title="5.0.0"> + 5.0.0</a> + <span class="divider">/</span> + </li> + <li class="active ">Oozie - </li> + + + + <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-04-09</li> + <li 
id="projectVersion" class="pull-right"> + Version: 5.0.0 + </li> + + </ul> + </div> + + + <div class="row-fluid"> + <div id="leftColumn" class="span2"> + <div class="well sidebar-nav"> + + + <ul class="nav nav-list"> + </ul> + + + + <hr /> + + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> + <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /> + </a> + </div> + </div> + </div> + + + <div id="bodyColumn" class="span10" > + + <p></p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<a name="Oozie_SLA_Monitoring"></a> +<div class="section"><h2> Oozie SLA Monitoring</h2> +<p><ul><ul><li><a href="#Overview">Overview</a> +</li> +<li><a href="#Oozie_Server_Configuration">Oozie Server Configuration</a> +</li> +<li><a href="#SLA_Tracking">SLA Tracking</a> +<ul><li><a href="#Event_Status">Event Status</a> +</li> +<li><a href="#SLA_Status">SLA Status</a> +</li> +</ul> +</li> +<li><a href="#Configuring_SLA_in_Applications">Configuring SLA in Applications</a> +<ul><li><a href="#SLA_Definition_in_Workflow">SLA Definition in Workflow</a> +</li> +<li><a href="#SLA_Definition_in_Workflow_Action">SLA Definition in Workflow Action</a> +</li> +<li><a href="#SLA_Definition_in_Coordinator_Action">SLA Definition in Coordinator Action</a> +</li> +</ul> +</li> +<li><a href="#Accessing_SLA_Information">Accessing SLA Information</a> +<ul><li><a href="#Scenario_1:_Workflow_Job_Start_Miss">Scenario 1: Workflow Job Start_Miss</a> +</li> +<li><a href="#Scenario_2:_Workflow_Action_End_Miss">Scenario 2: Workflow Action End_Miss</a> +</li> +<li><a href="#Scenario_3:_Coordinator_Action_Duration_Miss">Scenario 3: Coordinator Action Duration_Miss</a> +</li> +<li><a href="#Scenario_4:_All_Coordinator_actions_in_a_Bundle">Scenario 4: All Coordinator actions in a Bundle</a> +</li> 
+<li><a href="#Sample_Email_Alert">Sample Email Alert</a> +</li> +<li><a href="#Changing_job_SLA_definition_and_alerting">Changing job SLA definition and alerting</a> +<ul><li><a href="#a1._Specify_in_Bundle_XML_during_submission.">1. Specify in Bundle XML during submission.</a> +</li> +<li><a href="#a2._Specify_during_Coordinator_job_submission_or_update">2. Specify during Coordinator job submission or update</a> +</li> +<li><a href="#a3._Change_using_command_line">3. Change using command line</a> +</li> +<li><a href="#a4._Change_using_REST_API">4. Change using REST API</a> +</li> +</ul> +</li> +</ul> +</li> +<li><a href="#Known_issues">Known issues</a> +</li> +</ul> +</ul> +</p> +<a name="Overview"></a> +<div class="section"><h3>Overview</h3> +<p>Critical jobs can have certain SLA requirements associated with them. This SLA can be expressed in terms of time, +i.e. a maximum allowed time limit for when the job should start, when it should end, +and how long it should run. Oozie workflows and coordinators allow defining such SLA limits in the application definition XML.</p> +<p>With the addition of SLA Monitoring, Oozie can now actively monitor the state of these SLA-sensitive jobs +and send out notifications when SLAs are met or missed.</p> +<p>In versions earlier than 4.x, this was a passive feature where users needed to query the Oozie client SLA API +to fetch the records regarding job status changes, and use their own custom calculation engine to compute +whether the SLA was met or missed, based on the initial definition of time limits.</p> +<p>Oozie now also has an SLA tab in the Oozie UI, where users can query for SLA information and have a summarized view +of how their jobs fared against their SLAs.</p> +<a name="Oozie_Server_Configuration"></a> +</div> +<div class="section"><h3>Oozie Server Configuration</h3> +<p>Refer to <a href="./AG_Install.html#Notifications_Configuration">Notifications Configuration</a> + for configuring Oozie server to track +SLA for jobs and 
send notifications.</p> +<a name="SLA_Tracking"></a> +</div> +<div class="section"><h3>SLA Tracking</h3> +<p>Oozie allows tracking SLA for meeting the following criteria:<ul><li>Start time</li> +<li>End time</li> +<li>Job Duration</li> +</ul> +</p> +<a name="Event_Status"></a> +<div class="section"><h5>Event Status</h5> +<p>For each of these 3 criteria, your jobs are evaluated as either Met or Miss, i.e.<ul><li>START_MET, START_MISS</li> +<li>END_MET, END_MISS</li> +<li>DURATION_MET, DURATION_MISS</li> +</ul> +</p> +<a name="SLA_Status"></a> +</div> +<div class="section"><h5>SLA Status</h5> +<p>Expected end-time is the most important criterion for the majority of users when deciding overall SLA Met or Miss. +Hence the <i>"SLA Status"</i> + for a job will transition through these four stages:<ul><li>Not_Started <-- Job not yet begun</li> +<li>In_Process <-- Job started and is running, and SLAs are being tracked</li> +<li>Met <-- caused by an END_MET</li> +<li>Miss <-- caused by an END_MISS</li> +</ul> +</p> +<p>In addition to overshooting the expected end-time, an END_MISS (and so an eventual SLA MISS) also occurs when the +job does not end successfully, e.g. goes to an error state - Failed/Killed/Error/Timedout.</p> +<a name="Configuring_SLA_in_Applications"></a> +</div> +</div> +<div class="section"><h3>Configuring SLA in Applications</h3> +<p>To make your jobs trackable for SLA, you simply need to add the <tt><sla:info></tt> + tag to your workflow application definition. +If you were already using the existing SLA schema in your workflows (Schema xmlns:sla="uri:oozie:sla:0.1"), you don't need to +do anything extra to receive SLA notifications via JMS messages. This new SLA monitoring framework is backward-compatible: +there is no need to change application XML for now and you can continue to fetch old records via the <a href="./DG_CommandLineTool.html#SLAOperations">command line API</a> +. 
+However, usage of old schema and API is deprecated and we strongly recommend using new schema.<ul><li>New SLA schema is 'uri:oozie:sla:0.2'</li> +<li>In order to use new SLA schema, you will need to upgrade your workflow/coordinator schema to 0.5 i.e. 'uri:oozie:workflow:0.5'</li> +</ul> +</p> +<a name="SLA_Definition_in_Workflow"></a> +<div class="section"><h4>SLA Definition in Workflow</h4> +<p>Example: +<pre> +<workflow-app name="test-wf-job-sla" + xmlns="uri:oozie:workflow:0.5" + xmlns:sla="uri:oozie:sla:0.2"> + <start to="grouper"/> + <action name="grouper"> + <map-reduce> + <job-tracker>jt</job-tracker> + <name-node>nn</name-node> + <configuration> + <property> + <name>mapred.input.dir</name> + <value>input</value> + </property> + <property> + <name>mapred.output.dir</name> + <value>output</value> + </property> + </configuration> + </map-reduce> + <ok to="end"/> + <error to="end"/> + </action> + <end name="end"/> + <sla:info> + <sla:nominal-time>${nominal_time}</sla:nominal-time> + <sla:should-start>${10 * MINUTES}</sla:should-start> + <sla:should-end>${30 * MINUTES}</sla:should-end> + <sla:max-duration>${30 * MINUTES}</sla:max-duration> + <sla:alert-events>start_miss,end_miss,duration_miss</sla:alert-events> + <sla:alert-contact>[email protected]</sla:alert-contact> + </sla:info> +</workflow-app> +</pre></p> +<p>For the list of tags usable under <tt><sla:info></tt> +, refer to <a href="./WorkflowFunctionalSpec.html#SLASchema">Schemas Appendix</a> +. +This new schema is much more compact and meaningful, getting rid of redundant and unused tags.</p> +<p><ul><li><b><tt>nominal-time</tt> +</b> +: As the name suggests, this is the time relative to which your jobs' SLAs will be calculated. Generally since Oozie workflows are aligned with synchronous data dependencies, this nominal time can be parameterized to be passed the value of your coordinator nominal time. 
Nominal time is also required in case of independent workflows, and you can specify the time at which you expect the workflow to be run if you don't have a synchronous dataset associated with it.</li> +<li><b><tt>should-start</tt> +</b> +: Relative to <tt>nominal-time</tt> +, this is the amount of time (along with time-unit - MINUTES, HOURS, DAYS) within which your job should <b>start running</b> + to meet SLA. This is optional.</li> +<li><b><tt>should-end</tt> +</b> +: Relative to <tt>nominal-time</tt> +, this is the amount of time (along with time-unit - MINUTES, HOURS, DAYS) within which your job should <b>finish</b> + to meet SLA.</li> +<li><b><tt>max-duration</tt> +</b> +: This is the maximum amount of time (along with time-unit - MINUTES, HOURS, DAYS) your job is expected to run. This is optional.</li> +<li><b><tt>alert-events</tt> +</b> +: Specify the types of events for which <b>Email</b> + alerts should be sent. Allowable values in this comma-separated list are start_miss, end_miss and duration_miss. *_met events can generally be deemed low priority and hence email alerting for these is not necessary. However, note that this setting applies only to alerts via <b>email</b> + and not to JMS messages, where all events send out notifications and the user can filter them using the desired selectors. This is optional and only applicable when alert-contact is configured.</li> +<li><b><tt>alert-contact</tt> +</b> +: Specify a comma-separated list of email addresses where you wish your alerts to be sent. 
This is optional and need not be configured if you just want to view your job SLA history in the UI and do not want to receive email alerts.</li> +</ul> +</p> +<p>NOTE: All tags can be parameterized as a EL function or a fixed value.</p> +<p>Same schema can be applied to and embedded under Workflow-Action as well as Coordinator-Action XML.</p> +<a name="SLA_Definition_in_Workflow_Action"></a> +</div> +<div class="section"><h4>SLA Definition in Workflow Action</h4> +<p><pre> +<workflow-app name="test-wf-action-sla" xmlns="uri:oozie:workflow:0.5" xmlns:sla="uri:oozie:sla:0.2"> + <start to="grouper"/> + <action name="grouper"> + ... + <ok to="end"/> + <error to="end"/> + <sla:info> + <sla:nominal-time>${nominal_time}</sla:nominal-time> + <sla:should-start>${10 * MINUTES}</sla:should-start> + ... + </sla:info> + </action> + <end name="end"/> +</workflow-app> +</pre></p> +<a name="SLA_Definition_in_Coordinator_Action"></a> +</div> +<div class="section"><h4>SLA Definition in Coordinator Action</h4> +<p><pre> +<coordinator-app name="test-coord-sla" frequency="${coord:days(1)}" freq_timeunit="DAY" + end_of_duration="NONE" start="2013-06-20T08:01Z" end="2013-12-01T08:01Z" + timezone="America/Los_Angeles" xmlns="uri:oozie:coordinator:0.4" xmlns:sla="uri:oozie:sla:0.2"> + <action> + <workflow> + <app-path>${wfAppPath}</app-path> + </workflow> + <sla:info> + <sla:nominal-time>${nominal_time}</sla:nominal-time> + ... 
+ </sla:info> + </action> +</coordinator-app> +</pre></p> +<a name="Accessing_SLA_Information"></a> +</div> +</div> +<div class="section"><h3>Accessing SLA Information</h3> +<p>SLA information is accessible in the following ways:<ul><li>Through the SLA tab of the Oozie Web UI.</li> +<li>JMS messages sent to a configured JMS provider for instantaneous tracking.</li> +<li>RESTful API to query for SLA summary.</li> +<li>As an <tt>Instrumentation.Counter</tt> + entry that is accessible via RESTful API and reflects the number of all SLA-tracked external +entities. The name of this counter is <tt>sla-calculator.sla-map</tt> +.</li> +</ul> +</p> +<p>For JMS Notifications, you have to have a message broker in place, on which Oozie publishes messages and to which you can +hook up a subscriber to receive those messages. For more info on setting up and consuming JMS messages, refer to the +<a href="./DG_JMSNotifications.html">JMS Notifications</a> + documentation.</p> +<p>In the REST API, the following filters can be applied while fetching SLA information:<ul><li>app_name - Application name</li> +<li>id - id of the workflow job, workflow action or coordinator action</li> +<li>parent_id - Parent id of the workflow job, workflow action or coordinator action</li> +<li>nominal_start and nominal_end - Start and End range for nominal time of the workflow or coordinator.</li> +<li>bundle - Bundle Job ID or Bundle App Name. Fetches SLA information for actions of all coordinators in that bundle.</li> +<li>event_status - event status such as START_MET/START_MISS/DURATION_MET/DURATION_MISS/END_MET/END_MISS</li> +<li>sla_status - sla status such as NOT_STARTED/IN_PROCESS/MET/MISS</li> +</ul> +</p> +<p>Multiple event_status and sla_status values can be specified, separated by commas. When multiple statuses are specified, they are combined with OR. 
+For example, event_status=START_MET,END_MISS lists the coordinator actions where the event status is either START_MET OR END_MISS.</p> +<p>When the timezone query parameter is specified, the expected and actual start/end times returned are formatted. If not specified, +the number of milliseconds that have elapsed since January 1, 1970 00:00:00.000 GMT is returned.</p> +<p>The examples below demonstrate the use of the REST API and explain the JSON response.</p> +<a name="Scenario_1:_Workflow_Job_Start_Miss"></a> +<div class="section"><h4>Scenario 1: Workflow Job Start_Miss</h4> +<p><b>Request:</b> + +<pre> +GET <oozie-host>:<port>/oozie/v2/sla?timezone=GMT&filter=nominal_start=2013-06-18T00:01Z;nominal_end=2013-06-23T00:01Z;app_name=my-sla-app +</pre></p> +<p><b>JSON Response</b> + +<pre> +{ id : "000056-1238791320234-oozie-joe-W" + parentId : "000001-1238791320234-oozie-joe-C@8" + appType : "WORKFLOW_JOB" + msgType : "SLA" + appName : "my-sla-app" + slaStatus : "IN_PROCESS" + jobStatus : "RUNNING" + user: "joe" + nominalTime: "2013-06-22T05:00Z" + expectedStartTime: "2013-06-22T05:10Z" <-- (should start by this time) + actualStartTime: "2013-06-22T05:30Z" <-- (20 min late relative to expected start) + expectedEndTime: "2013-06-22T05:40Z" <-- (should end by this time) + actualEndTime: null + expectedDuration: 900000 <-- (expected duration in milliseconds) + actualDuration: 120000 <-- (actual duration in milliseconds) + notificationMessage: "My Job has encountered an SLA event!" 
+ upstreamApps: "dependent-app-1, dependent-app-2" +} +</pre> +</p> +<a name="Scenario_2:_Workflow_Action_End_Miss"></a> +</div> +<div class="section"><h4>Scenario 2: Workflow Action End_Miss</h4> +<p><b>Request:</b> + +<pre> +GET <oozie-host>:<port>/oozie/v2/sla?timezone=GMT&filter=parent_id=000056-1238791320234-oozie-joe-W +</pre></p> +<p><b>JSON Response</b> + +<pre> +{ id : "000056-1238791320234-oozie-joe-W@map-reduce-action" + parentId : "000056-1238791320234-oozie-joe-W" + appType : "WORKFLOW_ACTION" + msgType : "SLA" + appName : "map-reduce-action" + slaStatus : "MISS" + jobStatus : "SUCCEEDED" + user: "joe" + nominalTime: "2013-06-22T05:00Z" + expectedStartTime: "2013-06-22T05:10Z" + actualStartTime: "2013-06-22T05:05Z" + expectedEndTime: "2013-06-22T05:40Z" <-- (should end by this time) + actualEndTime: "2013-06-22T06:00Z" <-- (20 min late relative to expected end) + expectedDuration: 3600000 <-- (expected duration in milliseconds) + actualDuration: 3300000 <-- (actual duration in milliseconds) + notificationMessage: "My Job has encountered an SLA event!" 
+ upstreamApps: "dependent-app-1, dependent-app-2" +} +</pre> +</p> +<a name="Scenario_3:_Coordinator_Action_Duration_Miss"></a> +</div> +<div class="section"><h4>Scenario 3: Coordinator Action Duration_Miss</h4> +<p><b>Request:</b> + +<pre> +GET <oozie-host>:<port>/oozie/v2/sla?timezone=GMT&filter=id=000001-1238791320234-oozie-joe-C +</pre></p> +<p><b>JSON Response</b> + +<pre> +{ id : "000001-1238791320234-oozie-joe-C@2" + parentId : "000001-1238791320234-oozie-joe-C" + appType : "COORDINATOR_ACTION" + msgType : "SLA" + appName : "my-coord-app" + slaStatus : "MET" + jobStatus : "SUCCEEDED" + user: "joe" + nominalTime: "2013-06-22T05:00Z" + expectedStartTime: "2013-06-22T05:10Z" + actualStartTime: "2013-06-22T05:05Z" + expectedEndTime: "2013-06-22T05:40Z" + actualEndTime: "2013-06-22T05:30Z" + expectedDuration: 900000 <-- (expected duration in milliseconds) + actualDuration: 1500000 <-- (actual duration in milliseconds) + notificationMessage: "My Job has encountered an SLA event!" + upstreamApps: "dependent-app-1, dependent-app-2" +} +</pre> +</p> +<p>Scenario #3 is particularly interesting: it is an overall "MET" because it met its expected End-time, +but it is a "Duration_Miss" because the actual run (between actual start and actual end) exceeded the expected duration.</p> +<a name="Scenario_4:_All_Coordinator_actions_in_a_Bundle"></a> +</div> +<div class="section"><h4>Scenario 4: All Coordinator actions in a Bundle</h4> +<p><b>Request:</b> + +<pre> +GET <oozie-host>:<port>/oozie/v2/sla?timezone=GMT&filter=bundle=1234567-150130225116604-oozie-B;event_status=END_MISS +</pre></p> +<p><b>JSON Response</b> + +<pre> +{ + id : "000001-1238791320234-oozie-joe-C@1" + parentId : "000001-1238791320234-oozie-joe-C" + appType : "COORDINATOR_ACTION" + msgType : "SLA" + appName : "my-coord-app" + slaStatus : "MET" + eventStatus : "START_MET,DURATION_MISS,END_MISS" + user: "joe" + nominalTime: "2014-01-10T12:00Z" + expectedStartTime: "2014-01-10T12:00Z" + actualStartTime: 
"2014-01-10T11:59Z" + startDelay: -1 + expectedEndTime: "2014-01-10T13:00Z" + actualEndTime: "2014-01-10T13:05Z" + endDelay: 5 + expectedDuration: 3600000 <-- (expected duration in milliseconds) + actualDuration: 3960000 <-- (actual duration in milliseconds) + durationDelay: 6 <-- (duration delay in minutes) +} +{ + id : "000001-1238791320234-oozie-joe-C@2" + parentId : "000001-1238791320234-oozie-joe-C" + appType : "COORDINATOR_ACTION" + msgType : "SLA" + appName : "my-coord-app" + slaStatus : "MET" + eventStatus : "START_MISS,DURATION_MET,END_MISS" + user: "joe" + nominalTime: "2014-01-11T12:00Z" + expectedStartTime: "2014-01-11T12:00Z" + actualStartTime: "2014-01-11T12:05Z" + startDelay: 5 + expectedEndTime: "2014-01-11T13:00Z" + actualEndTime: "2014-01-11T13:01Z" + endDelay: 1 + expectedDuration: 3600000 <-- (expected duration in milliseconds) + actualDuration: 3360000 <-- (actual duration in milliseconds) + durationDelay: -4 <-- (duration delay in minutes) +} +</pre></p> +<p>Scenario #4 (All Coordinator actions in a Bundle) is to get SLA information of all coordinator actions under bundle job in one call. 
+The returned startDelay, durationDelay and endDelay values indicate the delay relative to the expected time (positive values in case of MISS, negative values in case of MET).</p> +<a name="Sample_Email_Alert"></a> +</div> +<div class="section"><h4>Sample Email Alert</h4> +<p><pre> +Subject: OOZIE - SLA END_MISS (AppName=wf-sla-job, JobID=0000004-130610225200680-oozie-oozi-W) +Status: + SLA Status - END_MISS + Job Status - RUNNING + Notification Message - Missed SLA for Data Pipeline job +Job Details: + App Name - wf-sla-job + App Type - WORKFLOW_JOB + User - strat_ci + Job ID - 0000004-130610225200680-oozie-oozi-W + Job URL - http://host.domain.com:4080/oozie//?job=0000004-130610225200680-oozie-oozi-W + Parent Job ID - N/A + Parent Job URL - N/A + Upstream Apps - wf-sla-up-app +SLA Details: + Nominal Time - Mon Jun 10 23:33:00 UTC 2013 + Expected Start Time - Mon Jun 10 23:35:00 UTC 2013 + Actual Start Time - Mon Jun 10 23:34:04 UTC 2013 + Expected End Time - Mon Jun 10 23:38:00 UTC 2013 + Expected Duration (in mins) - 5 + Actual Duration (in mins) - -1 +</pre> +</p> +<a name="Changing_job_SLA_definition_and_alerting"></a> +</div> +<div class="section"><h4>Changing job SLA definition and alerting</h4> +<p>The following are ways to enable or disable SLA alerts for coordinator actions.</p> +<a name="a1._Specify_in_Bundle_XML_during_submission."></a> +<div class="section"><h5>1. Specify in Bundle XML during submission.</h5> +<p>The following properties can be specified in the bundle XML as properties for a coordinator.</p> +<p><tt>oozie.sla.disable.alerts.older.than</tt> + This property is specified in hours. SLA notifications are disabled for +coord actions whose nominal time is older than this value. The default is 48 hours. +<pre> +<property> + <name>oozie.sla.disable.alerts.older.than</name> + <value>12</value> +</property> +</pre></p> +<p><tt>oozie.sla.disable.alerts</tt> + List of coord actions to be disabled. The value can be specified as a list of coord actions or as a date range. 
+<pre> +<property> + <name>oozie.sla.disable.alerts</name> + <value>1,3-4,7-10</value> +</property> +</pre> +This will disable alerts for coord actions 1, 3, 4, 7, 8, 9 and 10.</p> +<p><tt>oozie.sla.enable.alerts</tt> + List of coord actions to be enabled. The value can be specified as a list of coord actions or as a date range. +<pre> +<property> + <name>oozie.sla.enable.alerts</name> + <value>2009-01-01T01:00Z::2009-05-31T23:59Z</value> +</property> +</pre> +This will enable SLA alerts for coord actions whose nominal time is between 2009-01-01T01:00Z and 2009-05-31T23:59Z (inclusive).</p> +<p>The ALL keyword can be used to select all actions. The property below disables SLA notifications for all coord actions. +<pre> +<property> + <name>oozie.sla.disable.alerts</name> + <value>ALL</value> +</property> +</pre></p> +<a name="a2._Specify_during_Coordinator_job_submission_or_update"></a> +</div> +<div class="section"><h5>2. Specify during Coordinator job submission or update</h5> +<p>The above properties can be specified in job.properties in the +<a href="./DG_CommandLineTool.html#Updating_coordinator_definition_and_properties">Coord job update command</a> +, +in the <a href="./DG_CommandLineTool.html#Submitting_a_Workflow_Coordinator_or_Bundle_Job">Coord job submit command</a> + +or in the <a href="./DG_CommandLineTool.html#Running_a_Workflow_Coordinator_or_Bundle_Job">Coord job run command</a> +</p> +<a name="a3._Change_using_command_line"></a> +</div> +<div class="section"><h5>3. Change using command line</h5> +<p>Refer to <a href="./DG_CommandLineTool.html#Changing_job_SLA_definition_and_alerting">Changing job SLA definition and alerting</a> + for command line usage.</p> +<a name="a4._Change_using_REST_API"></a> +</div> +<div class="section"><h5>4. 
Change using REST API</h5> +<p>Refer to the REST API section <a href="./WebServicesAPI.html#Changing_job_SLA_definition_and_alerting">Changing job SLA definition and alerting</a> +.</p> +<a name="Known_issues"></a> +</div> +</div> +</div> +<div class="section"><h3>Known issues</h3> +<p>There are two known issues when you define an SLA for a workflow action.<ul><li>If there are decision nodes and an SLA is defined for a workflow action that is not in the execution path because of the decision node, you will still get an SLA_MISS notification.</li> +<li>If you have dangling action nodes in your workflow definition and an SLA is defined for them, you will still get an SLA_MISS notification.</li> +</ul> +</p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<p></p> +</div> + + </div> + </div> + </div> + + <hr/> + + <footer> + <div class="container-fluid"> + <div class="row-fluid"> + <p >Copyright © 2018 + <a href="http://www.apache.org">Apache Software Foundation</a>. + All rights reserved. 
+ + </p> + </div> + + + </div> + </footer> + </body> +</html> Added: websites/staging/oozie/trunk/content/docs/5.0.0/DG_ShellActionExtension.html ============================================================================== --- websites/staging/oozie/trunk/content/docs/5.0.0/DG_ShellActionExtension.html (added) +++ websites/staging/oozie/trunk/content/docs/5.0.0/DG_ShellActionExtension.html Mon Apr 9 14:26:49 2018 @@ -0,0 +1,612 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at Apr 9, 2018 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Oozie - </title> + <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="./css/site.css" /> + <link rel="stylesheet" href="./css/print.css" media="print" /> + + + <script type="text/javascript" src="./js/apache-maven-fluido-1.4.min.js"></script> + + + </head> + <body class="topBarDisabled"> + + + + <div class="container-fluid"> + <div id="banner"> + <div class="pull-left"> + <a href="https://oozie.apache.org/" id="bannerLeft"> + <img src="https://oozie.apache.org/images/oozie_200x.png" alt="Oozie"/> + </a> + </div> + <div class="pull-right"> </div> + <div class="clear"><hr/></div> + </div> + + <div id="breadcrumbs"> + <ul class="breadcrumb"> + + + <li class=""> + <a href="../../" title="Apache"> + Apache</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../../" title="Oozie"> + Oozie</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="../" title="docs"> + docs</a> + <span class="divider">/</span> + </li> + <li class=""> + <a href="./" title="5.0.0"> + 5.0.0</a> + <span class="divider">/</span> + </li> + <li class="active ">Oozie - </li> + + + + <li id="publishDate" 
class="pull-right"><span class="divider">|</span> Last Published: 2018-04-09</li> + <li id="projectVersion" class="pull-right"> + Version: 5.0.0 + </li> + + </ul> + </div> + + + <div class="row-fluid"> + <div id="leftColumn" class="span2"> + <div class="well sidebar-nav"> + + + <ul class="nav nav-list"> + </ul> + + + + <hr /> + + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> + <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" /> + </a> + </div> + </div> + </div> + + + <div id="bodyColumn" class="span10" > + + <p></p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<hr /> +<a name="Oozie_Shell_Action_Extension"></a> +<div class="section"><h2> Oozie Shell Action Extension</h2> +<p><ul><ul><li><a href="#Shell_Action">Shell Action</a> +<ul><li><a href="#Shell_Action_Configuration">Shell Action Configuration</a> +</li> +<li><a href="#Shell_Action_Logging">Shell Action Logging</a> +</li> +<li><a href="#Shell_Action_Limitations">Shell Action Limitations</a> +</li> +</ul> +</li> +<li><a href="#Appendix_Shell_XML-Schema">Appendix, Shell XML-Schema</a> +<ul><li><a href="#AE.A_Appendix_A_Shell_XML-Schema">AE.A Appendix A, Shell XML-Schema</a> +<ul><li><a href="#Shell_Action_Schema_Version_1.0">Shell Action Schema Version 1.0</a> +</li> +<li><a href="#Shell_Action_Schema_Version_0.3">Shell Action Schema Version 0.3</a> +</li> +<li><a href="#Shell_Action_Schema_Version_0.2">Shell Action Schema Version 0.2</a> +</li> +<li><a href="#Shell_Action_Schema_Version_0.1">Shell Action Schema Version 0.1</a> +</li> +</ul> +</li> +</ul> +</li> +</ul> +</ul> +</p> +<p><a name="ShellAction"></a> +</p> +<a name="Shell_Action"></a> +<div class="section"><h3>Shell Action</h3> +<p>The <tt>shell</tt> + action runs a Shell command.</p> +<p>The workflow job 
will wait until the Shell command completes before +continuing to the next action.</p> +<p>To run the Shell job, you have to configure the <tt>shell</tt> + action with the +<tt>job-tracker</tt>, <tt>name-node</tt> + and Shell <tt>exec</tt> + elements as +well as the necessary arguments and configuration.</p> +<p>A <tt>shell</tt> + action can be configured to create or delete HDFS directories +before starting the Shell job.</p> +<p>Shell <i>launcher</i> + configuration can be specified with a file, using the <tt>job-xml</tt> + +element, and inline, using the <tt>configuration</tt> + elements.</p> +<p>Oozie EL expressions can be used in the inline configuration. Property +values specified in the <tt>configuration</tt> + element override values specified +in the <tt>job-xml</tt> + file.</p> +<p>Note that YARN <tt>yarn.resourcemanager.address</tt> + (<tt>resource-manager</tt>) and HDFS <tt>fs.default.name</tt> + (<tt>name-node</tt>) properties +must not be present in the inline configuration.</p> +<p>As with Hadoop <tt>map-reduce</tt> + jobs, it is possible to add files and +archives in order to make them available to the Shell job. Refer to the +<a href="./WorkflowFunctionalSpec.html#FilesArchives">Adding Files and Archives for the Job</a> +section for more information about this feature.</p> +<p>The output (STDOUT) of the Shell job can be made available to the workflow job after the Shell job ends. This information +could be used from within decision nodes. If the output of the Shell job is made available to the workflow job, the shell +command must meet the following requirements:</p> +<p><ul><li>The format of the output must be a valid Java Properties file.</li> +<li>The size of the output must not exceed 2KB.</li> +</ul> +</p> +<p><b>Syntax:</b> +</p> +<p><pre> +<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:1.0"> + ... 
+ <action name="[NODE-NAME]"> + <shell xmlns="uri:oozie:shell-action:1.0"> + <resource-manager>[RESOURCE-MANAGER]</resource-manager> + <name-node>[NAME-NODE]</name-node> + <prepare> + <delete path="[PATH]"/> + ... + <mkdir path="[PATH]"/> + ... + </prepare> + <job-xml>[SHELL SETTINGS FILE]</job-xml> + <configuration> + <property> + <name>[PROPERTY-NAME]</name> + <value>[PROPERTY-VALUE]</value> + </property> + ... + </configuration> + <exec>[SHELL-COMMAND]</exec> + <argument>[ARG-VALUE]</argument> + ... + <argument>[ARG-VALUE]</argument> + <env-var>[VAR1=VALUE1]</env-var> + ... + <env-var>[VARN=VALUEN]</env-var> + <file>[FILE-PATH]</file> + ... + <archive>[FILE-PATH]</archive> + ... + <capture-output/> + </shell> + <ok to="[NODE-NAME]"/> + <error to="[NODE-NAME]"/> + </action> + ... +</workflow-app> +</pre></p> +<p>The <tt>prepare</tt> + element, if present, indicates a list of paths to delete +or create before starting the job. Specified paths must start with <tt>hdfs://HOST:PORT</tt> +.</p> +<p>The <tt>job-xml</tt> + element, if present, specifies a file containing configuration +for the Shell job. As of schema 0.2, multiple <tt>job-xml</tt> + elements are allowed in order to +specify multiple <tt>job.xml</tt> + files.</p> +<p>The <tt>configuration</tt> + element, if present, contains configuration +properties that are passed to the Shell job.</p> +<p>The <tt>exec</tt> + element must contain the path of the Shell command to +execute. The arguments of the Shell command can then be specified +using one or more <tt>argument</tt> + elements.</p> +<p>The <tt>argument</tt> + element, if present, contains an argument to be passed to +the Shell command.</p> +<p>The <tt>env-var</tt> + element, if present, contains an environment variable to be passed +to the Shell command. Each <tt>env-var</tt> + should contain only one environment variable +and value pair. 
If the pair references a variable such as $PATH, it should follow the +Unix convention, for example PATH=$PATH:mypath. Do not use ${PATH}, which would be +substituted by Oozie's EL evaluator.</p> +<p>A <tt>shell</tt> + action creates a Hadoop configuration. The Hadoop configuration is made available as a local file to the +Shell application in its running directory. The exact file path is exposed to the spawned shell using the environment +variable called <tt>OOZIE_ACTION_CONF_XML</tt> +. The Shell application can access the environment variable to read the action +configuration XML file path.</p> +<p>If the <tt>capture-output</tt> + element is present, it tells Oozie to capture the STDOUT of the shell command +execution. The Shell command output must be in Java Properties file format and it must not exceed 2KB. From within the +workflow definition, the output of a Shell action node is accessible via the <tt>String action:output(String node, +String key)</tt> + function (refer to section '4.2.6 Action EL Functions').</p> +<p>All the above elements can be parameterized (templatized) using EL +expressions.</p> +<p><b>Example:</b> +</p> +<p>How to run a shell script, a Perl script or a C++ executable:</p> +<p><pre> +<workflow-app xmlns='uri:oozie:workflow:1.0' name='shell-wf'> + <start to='shell1' /> + <action name='shell1'> + <shell xmlns="uri:oozie:shell-action:1.0"> + <resource-manager>${resourceManager}</resource-manager> + <name-node>${nameNode}</name-node> + <configuration> + <property> + <name>mapred.job.queue.name</name> + <value>${queueName}</value> + </property> + </configuration> + <exec>${EXEC}</exec> + <argument>A</argument> + <argument>B</argument> + <file>${EXEC}#${EXEC}</file> <!--Copy the executable to compute node's current working directory --> + </shell> + <ok to="end" /> + <error to="fail" /> + </action> + <kill name="fail"> + <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message> + </kill> + <end name='end' 
/> +</workflow-app> +</pre></p> +<p>The corresponding job properties file used to submit the Oozie job could be as follows:</p> +<p><pre> +oozie.wf.application.path=hdfs://localhost:8020/user/kamrul/workflows/script +#The executable is expected to be in the Workflow directory. +#Shell Script to run +EXEC=script.sh +#CPP executable. Executable should be binary compatible with the compute node OS. +#EXEC=hello +#Perl script +#EXEC=script.pl +resourceManager=localhost:8032 +nameNode=hdfs://localhost:8020 +queueName=default +</pre> +</p> +<p>How to run a Java program bundled in a jar:</p> +<p><pre> +<workflow-app xmlns='uri:oozie:workflow:1.0' name='shell-wf'> + <start to='shell1' /> + <action name='shell1'> + <shell xmlns="uri:oozie:shell-action:1.0"> + <resource-manager>${resourceManager}</resource-manager> + <name-node>${nameNode}</name-node> + <configuration> + <property> + <name>mapred.job.queue.name</name> + <value>${queueName}</value> + </property> + </configuration> + <exec>java</exec> + <argument>-classpath</argument> + <argument>./${EXEC}:$CLASSPATH</argument> + <argument>Hello</argument> + <file>${EXEC}#${EXEC}</file> <!--Copy the jar to compute node current working directory --> + </shell> + <ok to="end" /> + <error to="fail" /> + </action> + <kill name="fail"> + <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message> + </kill> + <end name='end' /> +</workflow-app> +</pre></p> +<p>The corresponding job properties file used to submit the Oozie job could be as follows:</p> +<p><pre> +oozie.wf.application.path=hdfs://localhost:8020/user/kamrul/workflows/script +#Hello.jar file is expected to be in the Workflow directory. 
+EXEC=Hello.jar +resourceManager=localhost:8032 +nameNode=hdfs://localhost:8020 +queueName=default +</pre> +</p> +<a name="Shell_Action_Configuration"></a> +<div class="section"><h4>Shell Action Configuration</h4> +<p><tt>oozie.action.shell.setup.hadoop.conf.dir</tt> + - Generates a config directory with various core/hdfs/yarn/mapred-site.xml files and points <tt>HADOOP_CONF_DIR</tt> + and <tt>YARN_CONF_DIR</tt> + env-vars to it, before the Script is invoked. XML is sourced from the action configuration. Useful when the Shell script passed uses various <tt>hadoop</tt> + commands. Default is false. +<tt>oozie.action.shell.setup.hadoop.conf.dir.write.log4j.properties</tt> - When <tt>oozie.action.shell.setup.hadoop.conf.dir</tt> + is enabled, toggle if a log4j.properties file should also be written under the configuration files directory. Default is true. +<tt>oozie.action.shell.setup.hadoop.conf.dir.log4j.content</tt> - When <tt>oozie.action.shell.setup.hadoop.conf.dir.write.log4j.properties</tt> + is enabled, the content to write into the log4j.properties file under the configuration files directory. 
Default is a simple console-based stderr logger, as presented below: +<pre> +log4j.rootLogger=${hadoop.root.logger} +hadoop.root.logger=INFO,console +log4j.appender.console=org.apache.log4j.ConsoleAppender +log4j.appender.console.target=System.err +log4j.appender.console.layout=org.apache.log4j.PatternLayout +log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n +</pre></p> +<a name="Shell_Action_Logging"></a> +</div> +<div class="section"><h4>Shell Action Logging</h4> +<p>The Shell action's stdout and stderr output are redirected to the STDOUT of the Oozie Launcher map-reduce job task that runs the shell command.</p> +<p>In the Oozie web-console, the 'Console URL' link in the Shell action pop-up makes it possible +to navigate to the Oozie Launcher map-reduce job task logs via the Hadoop job-tracker web-console.</p> +<a name="Shell_Action_Limitations"></a> +</div> +<div class="section"><h4>Shell Action Limitations</h4> +<p>Although the Shell action can execute any shell command, there are some limitations.<ul><li>Interactive commands are not supported.</li> +<li>A command cannot be executed as a different user using sudo.</li> +<li>The user has to explicitly upload the required third-party packages (such as jars, shared libraries, executables etc). Oozie provides a way to upload them through Hadoop's Distributed Cache using the <file> and <archive> tags.</li> +<li>Since Oozie executes the shell command on a Hadoop compute node, the set of utilities installed by default on that node is not guaranteed. However, the most common Unix utilities are usually installed on all compute nodes. 
It is important to note that Oozie could only support the commands that are installed into the compute nodes or that are uploaded through Distributed Cache.</li> +</ul> +</p> +<a name="Appendix_Shell_XML-Schema"></a> +</div> +</div> +<div class="section"><h3>Appendix, Shell XML-Schema</h3> +<a name="AE.A_Appendix_A_Shell_XML-Schema"></a> +<div class="section"><h4>AE.A Appendix A, Shell XML-Schema</h4> +<a name="Shell_Action_Schema_Version_1.0"></a> +<div class="section"><h5>Shell Action Schema Version 1.0</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:shell="uri:oozie:shell-action:1.0" + elementFormDefault="qualified" + targetNamespace="uri:oozie:shell-action:1.0"> +. + <xs:include schemaLocation="oozie-common-1.0.xsd"/> +. + <xs:element name="shell" type="shell:ACTION"/> +. + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:choice> + <xs:element name="job-tracker" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="resource-manager" type="xs:string" minOccurs="0" maxOccurs="1"/> + </xs:choice> + <xs:element name="name-node" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="prepare" type="shell:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="launcher" type="shell:LAUNCHER" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="configuration" type="shell:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:element name="exec" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="argument" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="env-var" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="capture-output" type="shell:FLAG" minOccurs="0" maxOccurs="1"/> + </xs:sequence> + 
</xs:complexType> +. +</xs:schema> +</pre></p> +<a name="Shell_Action_Schema_Version_0.3"></a> +</div> +<div class="section"><h5>Shell Action Schema Version 0.3</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:shell="uri:oozie:shell-action:0.3" elementFormDefault="qualified" + targetNamespace="uri:oozie:shell-action:0.3"> <xs:element name="shell" type="shell:ACTION"/> + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:element name="job-tracker" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="name-node" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="prepare" type="shell:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="configuration" type="shell:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:element name="exec" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="argument" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="env-var" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="capture-output" type="shell:FLAG" minOccurs="0" maxOccurs="1"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="FLAG"/> + <xs:complexType name="CONFIGURATION"> + <xs:sequence> + <xs:element name="property" minOccurs="1" maxOccurs="unbounded"> + <xs:complexType> + <xs:sequence> + <xs:element name="name" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="value" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="description" minOccurs="0" maxOccurs="1" type="xs:string"/> + </xs:sequence> + </xs:complexType> + </xs:element> + </xs:sequence> + </xs:complexType> + <xs:complexType name="PREPARE"> + <xs:sequence> + <xs:element name="delete" type="shell:DELETE" 
minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="mkdir" type="shell:MKDIR" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="DELETE"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> + <xs:complexType name="MKDIR"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> +</xs:schema> +</pre> +</p> +<a name="Shell_Action_Schema_Version_0.2"></a> +</div> +<div class="section"><h5>Shell Action Schema Version 0.2</h5> +<p><pre> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:shell="uri:oozie:shell-action:0.2" elementFormDefault="qualified" + targetNamespace="uri:oozie:shell-action:0.2"> <xs:element name="shell" type="shell:ACTION"/> + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:element name="job-tracker" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="prepare" type="shell:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="configuration" type="shell:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:element name="exec" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="argument" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="env-var" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="capture-output" type="shell:FLAG" minOccurs="0" maxOccurs="1"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="FLAG"/> + <xs:complexType name="CONFIGURATION"> + <xs:sequence> + <xs:element name="property" minOccurs="1" maxOccurs="unbounded"> + <xs:complexType> + <xs:sequence> + <xs:element name="name" minOccurs="1" 
maxOccurs="1" type="xs:string"/> + <xs:element name="value" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="description" minOccurs="0" maxOccurs="1" type="xs:string"/> + </xs:sequence> + </xs:complexType> + </xs:element> + </xs:sequence> + </xs:complexType> + <xs:complexType name="PREPARE"> + <xs:sequence> + <xs:element name="delete" type="shell:DELETE" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="mkdir" type="shell:MKDIR" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="DELETE"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> + <xs:complexType name="MKDIR"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> +</xs:schema> +</pre> +</p> +<a name="Shell_Action_Schema_Version_0.1"></a> +</div> +<div class="section"><h5>Shell Action Schema Version 0.1</h5> +<p><pre> +<?xml version="1.0" encoding="UTF-8"?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+--> +<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" + xmlns:shell="uri:oozie:shell-action:0.1" elementFormDefault="qualified" + targetNamespace="uri:oozie:shell-action:0.1"> + <xs:element name="shell" type="shell:ACTION"/> + <xs:complexType name="ACTION"> + <xs:sequence> + <xs:element name="job-tracker" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="prepare" type="shell:PREPARE" minOccurs="0" maxOccurs="1"/> + <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="1"/> + <xs:element name="configuration" type="shell:CONFIGURATION" minOccurs="0" maxOccurs="1"/> + <xs:element name="exec" type="xs:string" minOccurs="1" maxOccurs="1"/> + <xs:element name="argument" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="env-var" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="capture-output" type="shell:FLAG" minOccurs="0" maxOccurs="1"/> + </xs:sequence> + </xs:complexType> + <xs:complexType name="FLAG"/> + <xs:complexType name="CONFIGURATION"> + <xs:sequence> + <xs:element name="property" minOccurs="1" maxOccurs="unbounded"> + <xs:complexType> + <xs:sequence> + <xs:element name="name" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="value" minOccurs="1" maxOccurs="1" type="xs:string"/> + <xs:element name="description" minOccurs="0" maxOccurs="1" type="xs:string"/> + </xs:sequence> + </xs:complexType> + </xs:element> + </xs:sequence> + </xs:complexType> + <xs:complexType name="PREPARE"> + <xs:sequence> + <xs:element name="delete" type="shell:DELETE" minOccurs="0" maxOccurs="unbounded"/> + <xs:element name="mkdir" type="shell:MKDIR" minOccurs="0" maxOccurs="unbounded"/> + </xs:sequence> + </xs:complexType> + <xs:complexType 
name="DELETE"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> + <xs:complexType name="MKDIR"> + <xs:attribute name="path" type="xs:string" use="required"/> + </xs:complexType> +</xs:schema> +</pre> +</p> +<p><a href="./index.html">::Go back to Oozie Documentation Index::</a> +</p> +<p></p> +</div> +</div> +</div> + + </div> + </div> + </div> + + <hr/> + + <footer> + <div class="container-fluid"> + <div class="row-fluid"> + <p >Copyright © 2018 + <a href="http://www.apache.org">Apache Software Foundation</a>. + All rights reserved. + + </p> + </div> + + + </div> + </footer> + </body> +</html>
