http://git-wip-us.apache.org/repos/asf/oozie/blob/4e5b3cb5/docs/src/site/twiki/AG_Monitoring.twiki ---------------------------------------------------------------------- diff --git a/docs/src/site/twiki/AG_Monitoring.twiki b/docs/src/site/twiki/AG_Monitoring.twiki index 425f554..4877dfc 100644 --- a/docs/src/site/twiki/AG_Monitoring.twiki +++ b/docs/src/site/twiki/AG_Monitoring.twiki @@ -1,33 +1,34 @@ -<noautolink> -[[index][::Go back to Oozie Documentation Index::]] ----+!! Oozie Monitoring +[::Go back to Oozie Documentation Index::](index.html) -%TOC% +# Oozie Monitoring ----++ Oozie Instrumentation +<!-- MACRO{toc|fromDepth=1|toDepth=4} --> + +## Oozie Instrumentation Oozie code is instrumented in several places to collect runtime metrics. The instrumentation data can be used to determine the health of the system, performance of the system, and to tune the system. This comes in two flavors: + * metrics (by default enabled since 5.0.0) * instrumentation (deprecated and by default disabled since 5.0.0) -The instrumentation is accessible via the Admin web-services API (see the [[WebServicesAPI#Oozie_Metrics][metrics]] and -[[WebServicesAPI#Oozie_Instrumentation][instrumentation]] Web Services API documentations for more details) and is also written on +The instrumentation is accessible via the Admin web-services API (see the [metrics](WebServicesAPI.html#Oozie_Metrics) and +[instrumentation](WebServicesAPI.html#Oozie_Instrumentation) Web Services API documentations for more details) and is also written on regular intervals to an instrumentation log. Instrumentation data includes variables, samplers, timers and counters. ----+++ Variables +### Variables * oozie * version: Oozie build version. * configuration - * config.dir: directory from where the configuration files are loaded. If null, all configuration files are loaded from the classpath. [[AG_Install#Oozie_Configuration][Configuration files are described here]]. 
+ * config.dir: directory from where the configuration files are loaded. If null, all configuration files are loaded from the classpath. [Configuration files are described here](AG_Install.html#Oozie_Configuration). * config.file: the Oozie custom configuration for the instance. * jvm @@ -43,7 +44,7 @@ Instrumentation data includes variables, samplers, timers and counters. * from.classpath: whether the config file has been read from the classpath or from the config directory. * reload.interval: interval at which the config file will be reloaded. 0 if the config file will never be reloaded, when loaded from the classpath is never reloaded. ----+++ Samplers - Poll data at a fixed interval (default 1 sec) and report an average utilization over a longer period of time (default 60 seconds). +### Samplers - Poll data at a fixed interval (default 1 sec) and report an average utilization over a longer period of time (default 60 seconds). Poll for data over fixed interval and generate an average over the time interval. Unless specified, all samplers in Oozie work on a 1 minute interval. @@ -64,14 +65,16 @@ Oozie work on a 1 minute interval. * requests * version ----+++ Counters - Maintain statistics about the number of times an event has occurred, for the running Oozie instance. The values are reset if the Oozie instance is restarted. +### Counters - Maintain statistics about the number of times an event has occurred, for the running Oozie instance. The values are reset if the Oozie instance is restarted. * action.executors - Counters related to actions. * [action_type]#action.[operation_performed] (start, end, check, kill) * [action_type]#ex.[exception_type] (transient, non-transient, error, failed) - * e.g. <verbatim> - ssh#action.end: 306 - ssh#action.start: 316 </verbatim> + * e.g. +``` +ssh#action.end: 306 +ssh#action.start: 316 +``` * callablequeue - count of events in various execution queues. * delayed.queued: Number of commands queued with a delay. 
@@ -113,7 +116,7 @@ Oozie work on a 1 minute interval. * version * version-GET ----+++ Timers - Maintain information about the time spent in various operations. +### Timers - Maintain information about the time spent in various operations. * action.executors - Counters related to actions. * [action_type]#action.[operation_performed] (start, end, check, kill) @@ -156,39 +159,41 @@ Oozie work on a 1 minute interval. * version * version-GET ----++ Oozie JVM Thread Dump - The =admin/jvminfo.jsp= servlet can be used to get some basic jvm stats and thread dump. -For eg: http://localhost:11000/oozie/admin/jvminfo.jsp?cpuwatch=1000&threadsort=cpu. It takes the following optional +## Oozie JVM Thread Dump +The `admin/jvminfo.jsp` servlet can be used to get some basic jvm stats and thread dump. +For eg: `http://localhost:11000/oozie/admin/jvminfo.jsp?cpuwatch=1000&threadsort=cpu`. It takes the following optional query parameters: + * threadsort - The order in which the threads are sorted for display. Valid values are name, cpu, state. Default is state. * cpuwatch - Time interval in milliseconds to monitor cpu usage of threads. Default value is 0. ----++ Monitoring Database Schema Integrity +## Monitoring Database Schema Integrity Oozie stores all of its state in a database. Hence, ensuring that the database schema is correct is very important to ensuring that -Oozie is healthy and behaves correctly. To help with this, Oozie includes a =SchemaCheckerService= which periodically runs and +Oozie is healthy and behaves correctly. To help with this, Oozie includes a `SchemaCheckerService` which periodically runs and performs a series of checks on the database schema. 
More specifically, it checks the following: + * Existence of the required tables * Existence of the required columns in each table * Each column has the correct type and default value * Existence of the required primary keys and indexes -After each run, the =SchemaCheckerService= writes the result of the checks to the Oozie log and to the "schema-checker.status" +After each run, the `SchemaCheckerService` writes the result of the checks to the Oozie log and to the "schema-checker.status" instrumentation variable. If there's a problem, it will be logged at the ERROR level, while correct checks are logged at the DEBUG level. -By default, the =SchemaCheckerService= runs every 7 days. This can be configured -by =oozie.service.SchemaCheckerService.check.interval= +By default, the `SchemaCheckerService` runs every 7 days. This can be configured +by `oozie.service.SchemaCheckerService.check.interval` -By default, the =SchemaCheckerService= will consider "extra" tables, columns, and indexes to be incorrect. Advanced users who have +By default, the `SchemaCheckerService` will consider "extra" tables, columns, and indexes to be incorrect. Advanced users who have added additional tables, columns, and indexes can tell Oozie to ignore these by -setting =oozie.service.SchemaCheckerService.ignore.extras= to =false=. +setting `oozie.service.SchemaCheckerService.ignore.extras` to `false`. -The =SchemaCheckerService= currently only supports MySQL, PostgreSQL, and Oracle databases. SQL Server and Derby are currently not +The `SchemaCheckerService` currently only supports MySQL, PostgreSQL, and Oracle databases. SQL Server and Derby are currently not supported. When Oozie HA is enabled, only one of the Oozie servers will perform the checks. -[[index][::Go back to Oozie Documentation Index::]] +[::Go back to Oozie Documentation Index::](index.html) + -</noautolink>
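
As an illustration of the `admin/jvminfo.jsp` query parameters documented in the AG_Monitoring.twiki change above, here is a minimal sketch that builds the request URL. The helper name `jvminfo_url` and its validation rules are assumptions made for this example and are not part of Oozie. The base URL, the parameter names, the allowed `threadsort` values, and the defaults are taken from the documentation text.

```python
from urllib.parse import urlencode

# Allowed sort orders, as listed in the documentation text above.
VALID_THREADSORT = ("name", "cpu", "state")


def jvminfo_url(base_url="http://localhost:11000/oozie", threadsort="state", cpuwatch=0):
    """Build the jvminfo.jsp URL (hypothetical helper, not an Oozie API).

    threadsort: order in which threads are displayed (default "state").
    cpuwatch:   interval in milliseconds for monitoring thread CPU usage (default 0).
    """
    if threadsort not in VALID_THREADSORT:
        raise ValueError(f"threadsort must be one of {VALID_THREADSORT}, got {threadsort!r}")
    if cpuwatch < 0:
        # Assumption for this sketch: a negative interval is treated as invalid.
        raise ValueError("cpuwatch must be a non-negative number of milliseconds")
    query = urlencode({"cpuwatch": cpuwatch, "threadsort": threadsort})
    return f"{base_url.rstrip('/')}/admin/jvminfo.jsp?{query}"


if __name__ == "__main__":
    # Reproduces the example URL given in the documentation.
    print(jvminfo_url(threadsort="cpu", cpuwatch=1000))
```

The resulting URL can be opened in a browser or fetched with any HTTP client. Whether authentication is required depends on how the Oozie server is configured.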
http://git-wip-us.apache.org/repos/asf/oozie/blob/4e5b3cb5/docs/src/site/twiki/AG_OozieLogging.twiki ---------------------------------------------------------------------- diff --git a/docs/src/site/twiki/AG_OozieLogging.twiki b/docs/src/site/twiki/AG_OozieLogging.twiki deleted file mode 100644 index ecdcfd3..0000000 --- a/docs/src/site/twiki/AG_OozieLogging.twiki +++ /dev/null @@ -1,87 +0,0 @@ -<noautolink> - -[[index][::Go back to Oozie Documentation Index::]] - ----+!! Oozie Logging - -%TOC% - ----++ Default Oozie Logging - -Oozie's logging properties can be configured in its log4j properties file (default is =oozie-log4j.properties=). Most log messages -are configured by default to be written to the =oozie= appender. - -The default configuration for the =oozie= appender is shown below. - ----+++ Default Configuration - -<verbatim> -log4j.appender.oozie=org.apache.log4j.rolling.RollingFileAppender -log4j.appender.oozie.RollingPolicy=org.apache.oozie.util.OozieRollingPolicy -log4j.appender.oozie.File=${oozie.log.dir}/oozie.log -log4j.appender.oozie.Append=true -log4j.appender.oozie.layout=org.apache.log4j.PatternLayout -log4j.appender.oozie.layout.ConversionPattern=%d{ISO8601} %5p %c{1}:%L - %m%n -log4j.appender.oozie.RollingPolicy.FileNamePattern=${log4j.appender.oozie.File}-%d{yyyy-MM-dd-HH} -log4j.appender.oozie.RollingPolicy.MaxHistory=720 -</verbatim> - -In this configuration, the active log file will be named =oozie.log= and all old log files will be named =oozie.log-yyyy-MM-dd-HH= -(where =yyyy-MM-dd-HH= is the time that that log file was created; e.g. 2012-07-21-05). All log files are in the same directory -(whatever =oozie.log.dir= is assigned to). A maximum of 720 older log files will be retained. The active log file is rolled every -hour, so 720 old logs means that they are kept for 30 days before being deleted. - -To keep all old logs instead of deleting them, =log4j.appender.oozie.RollingPolicy.MaxHistory= can be set to =-1=. 
-Additionally, =log4j.appender.oozie.RollingPolicy= can be set to =org.apache.log4j.rolling.TimeBasedRollingPolicy=, which has the -same exact behavior as =org.apache.oozie.util.OozieRollingPolicy= except that it does not delete old logs. - ----+++ Restrictions - -In order for Oozie logging to work 100% correctly, the following restrictions must be observed (described below and in -the =oozie-log4j.properties= file): - -* The appender that Oozie uses must be named "oozie" (i.e. =log4j.appender.oozie=) - -* =log4j.appender.oozie.RollingPolicy.FileNamePattern= must end with "-%d{yyyy-MM-dd-HH}.gz" or "-%d{yyyy-MM-dd-HH}". -If it ends with ".gz" the old logs will be compressed when rolled - -* =log4j.appender.oozie.RollingPolicy.FileNamePattern= must start with the value of =log4j.appender.oozie.File= - ----++ Previous Default Oozie Logging - -Oozie previously used the logging configuration shown below as the default for the =oozie= appender. The other appender that Oozie -writes to still use a configuration similar to this. - ----+++ Previous Default Configuration - -<verbatim> -log4j.appender.oozie=org.apache.log4j.DailyRollingFileAppender -log4j.appender.oozie.File=${oozie.log.dir}/oozie.log -log4j.appender.oozie.Append=true -log4j.appender.oozie.layout=org.apache.log4j.PatternLayout -log4j.appender.oozie.layout.ConversionPattern=%d{ISO8601} %5p %c{1}:%L - %m%n -log4j.appender.oozie.DatePattern='.'yyyy-MM-dd-HH -</verbatim> - -In this configuration, the active log file will be named =oozie.log= and all old log files will be named =oozie.log.yyyy-MM-dd-HH= -(where =yyyy-MM-dd-HH= is the time that the log file was created; e.g. 2012-07-21-05). All log files are in the same directory -(whatever =oozie.log.dir= is assigned to). All older log files are retained. The active log file is rolled every hour. 
- ----+++ Restrictions - -In order for Oozie logging to work 100% correctly, the following restrictions must be observed (described below and in the -=oozie-log4j.properties= file): - -* The appender that Oozie uses must be named "oozie" (i.e. =log4j.appender.oozie=) - -* =log4j.appender.oozie.DatePattern= must end with either "dd" or "HH". If it ends with "HH", the log will be rolled every hour; -if it ends with "dd", the log will be rolled every day. - ----++ Other Oozie Logging - -While Oozie can technically use any valid log4j Appender or configurations that violate the above restrictions, certain -features related to logs may be disabled and/or not work correctly, and is thus not advised. - -[[index][::Go back to Oozie Documentation Index::]] - -</noautolink> http://git-wip-us.apache.org/repos/asf/oozie/blob/4e5b3cb5/docs/src/site/twiki/AG_OozieUpgrade.twiki ---------------------------------------------------------------------- diff --git a/docs/src/site/twiki/AG_OozieUpgrade.twiki b/docs/src/site/twiki/AG_OozieUpgrade.twiki index 3024064..ec24011 100644 --- a/docs/src/site/twiki/AG_OozieUpgrade.twiki +++ b/docs/src/site/twiki/AG_OozieUpgrade.twiki @@ -1,12 +1,12 @@ -<noautolink> -[[index][::Go back to Oozie Documentation Index::]] ----+!! Oozie Upgrade +[::Go back to Oozie Documentation Index::](index.html) -%TOC% +# Oozie Upgrade ----+ Preparation +<!-- MACRO{toc|fromDepth=1|toDepth=4} --> + +## Preparation Make sure there are not Workflows in RUNNING or SUSPENDED status, otherwise the database upgrade will fail. @@ -14,29 +14,30 @@ Shutdown Oozie and backup the Oozie database. Copy the oozie-site.xml from your current setup. ----+ Oozie Server Upgrade +## Oozie Server Upgrade Expand the new Oozie tarball in a new location. 
-Edit the new =oozie-site.xml= setting all custom properties values from the old =oozie-site.xml= +Edit the new `oozie-site.xml` setting all custom properties values from the old `oozie-site.xml` IMPORTANT: From Oozie 2.x to Oozie 3.x the names of the database configuration properties have -changed. Their prefix has changed from =oozie.service.StoreService.*= to =oozie.service.JPAService.*=. +changed. Their prefix has changed from `oozie.service.StoreService.*` to `oozie.service.JPAService.*`. Make sure you are using the new prefix. -After upgrading the Oozie server, the =oozie-setup.sh= MUST be rerun before starting the +After upgrading the Oozie server, the `oozie-setup.sh` MUST be rerun before starting the upgraded Oozie server. Oozie database migration is required when there Oozie database schema changes, like upgrading from Oozie 2.x to Oozie 3.x. Configure the oozie-site.xml with the correct database configuration properties as -explained in the 'Database Configuration' section in [[AG_Install][Oozie Install]]. +explained in the 'Database Configuration' section in [Oozie Install](AG_Install.html). -Once =oozie-site.xml= has been configured with the database configuration execute the =ooziedb.sh= +Once `oozie-site.xml` has been configured with the database configuration execute the `ooziedb.sh` command line tool to upgrade the database: -<verbatim> + +``` $ bin/ooziedb.sh upgrade -run Validate DB Connection. @@ -64,22 +65,22 @@ Oozie DB has been upgraded to Oozie version '3.2.0' The SQL commands have been written to: /tmp/ooziedb-5737263881793872034.sql $ -</verbatim> +``` The new version of the Oozie server is ready to be started. -NOTE: If using MySQL or Oracle, copy the corresponding JDBC driver JAR file to the =libext/= directory before running -the =ooziedb.sh= command line tool. +NOTE: If using MySQL or Oracle, copy the corresponding JDBC driver JAR file to the `libext/` directory before running +the `ooziedb.sh` command line tool. 
-NOTE: If instead using the '-run' option, the '-sqlfile <FILE>' option is used, then all the +NOTE: If instead using the '-run' option, the `-sqlfile <FILE>` option is used, then all the database changes will be written to the specified file and the database won't be modified. ----+ Oozie Client Upgrade +## Oozie Client Upgrade While older Oozie clients work with newer Oozie server, to have access to all the functionality of the Oozie server the same version of Oozie client should be installed and used by users. -[[index][::Go back to Oozie Documentation Index::]] +[::Go back to Oozie Documentation Index::](index.html) + -</noautolink> http://git-wip-us.apache.org/repos/asf/oozie/blob/4e5b3cb5/docs/src/site/twiki/BundleFunctionalSpec.twiki ---------------------------------------------------------------------- diff --git a/docs/src/site/twiki/BundleFunctionalSpec.twiki b/docs/src/site/twiki/BundleFunctionalSpec.twiki index 9749df5..5301429 100644 --- a/docs/src/site/twiki/BundleFunctionalSpec.twiki +++ b/docs/src/site/twiki/BundleFunctionalSpec.twiki @@ -1,110 +1,111 @@ -<noautolink> -[[index][::Go back to Oozie Documentation Index::]] + +[::Go back to Oozie Documentation Index::](index.html) ----- ----+!! Oozie Bundle Specification +# Oozie Bundle Specification The goal of this document is to define a new oozie abstraction called bundle system specialized in submitting and maintaining a set of coordinator applications. -%TOC% +<!-- MACRO{toc|fromDepth=1|toDepth=4} --> ----++ Changelog +## Changelog ----++ 1. Bundle Overview +## 1. Bundle Overview Bundle is a higher-level oozie abstraction that will batch a set of coordinator applications. The user will be able to start/stop/suspend/resume/rerun in the bundle level resulting a better and easy operational control. -More specifically, the oozie *Bundle* system allows the user to define and execute a bunch of coordinator applications often called a data pipeline. 
There is no explicit dependency among the coordinator applications in a bundle. However, a user could use the data dependency of coordinator applications to create an implicit data application pipeline. +More specifically, the oozie **Bundle** system allows the user to define and execute a bunch of coordinator applications often called a data pipeline. There is no explicit dependency among the coordinator applications in a bundle. However, a user could use the data dependency of coordinator applications to create an implicit data application pipeline. ----++ 2. Definitions +## 2. Definitions -*Kick-off-time:* The time when a bundle should start and submit coordinator applications. +**Kick-off-time:** The time when a bundle should start and submit coordinator applications. -*Bundle Application:* A bundle application defines a set of coordinator applications and when to start those. Normally, bundle applications are parameterized. A bundle application is written in XML. +**Bundle Application:** A bundle application defines a set of coordinator applications and when to start those. Normally, bundle applications are parameterized. A bundle application is written in XML. -*Bundle Job:* A bundle job is an executable instance of a bundle application. A job submission is done by submitting a job configuration that resolves all parameters in the application definition. +**Bundle Job:** A bundle job is an executable instance of a bundle application. A job submission is done by submitting a job configuration that resolves all parameters in the application definition. -*Bundle Definition Language:* The language used to describe bundle applications. +**Bundle Definition Language:** The language used to describe bundle applications. ----++ 3. Expression Language for Parameterization +## 3. Expression Language for Parameterization Bundle application definitions can be parameterized with variables. At job submission time all the parameters are resolved into concrete values. 
-The parameterization of bundle definitions is done using JSP Expression Language syntax from the [[http://jcp.org/aboutJava/communityprocess/final/jsr152/][JSP 2.0 Specification (JSP.2.3)]], allowing not only to support variables as parameters but also complex expressions. +The parameterization of bundle definitions is done using JSP Expression Language syntax from the [JSP 2.0 Specification (JSP.2.3)](http://jcp.org/aboutJava/communityprocess/final/jsr152/index.html), allowing not only to support variables as parameters but also complex expressions. EL expressions can be used in XML attribute values and XML text element values. They cannot be used in XML element and XML attribute names. ----++ 4. Bundle Job +## 4. Bundle Job ----+++ 4.1. Bundle Job Status +### 4.1. Bundle Job Status -At any time, a bundle job is in one of the following status: *PREP, RUNNING, RUNNINGWITHERROR, SUSPENDED, PREPSUSPENDED, SUSPENDEDWITHERROR, PAUSED, PAUSEDWITHERROR, PREPPAUSED, SUCCEEDED, DONEWITHERROR, KILLED, FAILED*. +At any time, a bundle job is in one of the following status: **PREP, RUNNING, RUNNINGWITHERROR, SUSPENDED, PREPSUSPENDED, SUSPENDEDWITHERROR, PAUSED, PAUSEDWITHERROR, PREPPAUSED, SUCCEEDED, DONEWITHERROR, KILLED, FAILED**. ----+++ 4.2. Transitions of Bundle Job Status +### 4.2. 
Transitions of Bundle Job Status Valid bundle job status transitions are: - * *PREP --> PREPSUSPENDED | PREPPAUSED | RUNNING | KILLED* - * *RUNNING --> RUNNINGWITHERROR | SUSPENDED | PAUSED | SUCCEEDED | KILLED* - * *RUNNINGWITHERROR --> RUNNING | SUSPENDEDWITHERROR | PAUSEDWITHERROR | DONEWITHERROR | FAILED | KILLED* - * *PREPSUSPENDED --> PREP | KILLED* - * *SUSPENDED --> RUNNING | KILLED* - * *SUSPENDEDWITHERROR --> RUNNINGWITHERROR | KILLED* - * *PREPPAUSED --> PREP | KILLED* - * *PAUSED --> SUSPENDED | RUNNING | KILLED* - * *PAUSEDWITHERROR --> SUSPENDEDWITHERROR | RUNNINGWITHERROR | KILLED* + * **PREP --> PREPSUSPENDED | PREPPAUSED | RUNNING | KILLED** + * **RUNNING --> RUNNINGWITHERROR | SUSPENDED | PAUSED | SUCCEEDED | KILLED** + * **RUNNINGWITHERROR --> RUNNING | SUSPENDEDWITHERROR | PAUSEDWITHERROR | DONEWITHERROR | FAILED | KILLED** + * **PREPSUSPENDED --> PREP | KILLED** + * **SUSPENDED --> RUNNING | KILLED** + * **SUSPENDEDWITHERROR --> RUNNINGWITHERROR | KILLED** + * **PREPPAUSED --> PREP | KILLED** + * **PAUSED --> SUSPENDED | RUNNING | KILLED** + * **PAUSEDWITHERROR --> SUSPENDEDWITHERROR | RUNNINGWITHERROR | KILLED** ----+++ 4.3. Details of Status Transitions -When a bundle job is submitted, oozie parses the bundle job XML. Oozie then creates a record for the bundle with status *PREP* and returns a unique ID. +### 4.3. Details of Status Transitions +When a bundle job is submitted, oozie parses the bundle job XML. Oozie then creates a record for the bundle with status **PREP** and returns a unique ID. -When a user requests to suspend a bundle job that is in *PREP* state, oozie puts the job in status *PREPSUSPENDED*. Similarly, when pause time reaches for a bundle job with *PREP* status, oozie puts the job in status *PREPPAUSED*. +When a user requests to suspend a bundle job that is in **PREP** state, oozie puts the job in status **PREPSUSPENDED**. 
Similarly, when pause time reaches for a bundle job with **PREP** status, oozie puts the job in status **PREPPAUSED**. -Conversely, when a user requests to resume a *PREPSUSPENDED* bundle job, oozie puts the job in status *PREP*. And when pause time is reset for a bundle job that is in *PREPPAUSED* state, oozie puts the job in status *PREP*. +Conversely, when a user requests to resume a **PREPSUSPENDED** bundle job, oozie puts the job in status **PREP**. And when pause time is reset for a bundle job that is in **PREPPAUSED** state, oozie puts the job in status **PREP**. There are two ways a bundle job could be started. - * If =kick-off-time= (defined in the bundle xml) reaches. The default value is null which means starts coordinators NOW. +* If `kick-off-time` (defined in the bundle xml) reaches. The default value is null which means starts coordinators NOW. + +* If user sends a start request to START the bundle. - * If user sends a start request to START the bundle. +When a bundle job starts, oozie puts the job in status **RUNNING** and it submits all the coordinator jobs. If any coordinator job goes to **FAILED/KILLED/DONEWITHERROR** state, the bundle job is put in **RUNNINGWITHERROR** -When a bundle job starts, oozie puts the job in status *RUNNING* and it submits all the coordinator jobs. If any coordinator job goes to *FAILED/KILLED/DONEWITHERROR* state, the bundle job is put in *RUNNINGWITHERROR* +When a user requests to kill a bundle job, oozie puts the job in status **KILLED** and it sends kill to all submitted coordinator jobs. -When a user requests to kill a bundle job, oozie puts the job in status *KILLED* and it sends kill to all submitted coordinator jobs. +When a user requests to suspend a bundle job that is in **RUNNING** status, oozie puts the job in status **SUSPENDED** and it suspends all submitted coordinator jobs. 
Similarly, when a user requests to suspend a bundle job that is in **RUNNINGWITHERROR** status, oozie puts the job in status **SUSPENDEDWITHERROR** and it suspends all submitted coordinator jobs. -When a user requests to suspend a bundle job that is in *RUNNING* status, oozie puts the job in status *SUSPENDED* and it suspends all submitted coordinator jobs. Similarly, when a user requests to suspend a bundle job that is in *RUNNINGWITHERROR* status, oozie puts the job in status *SUSPENDEDWITHERROR* and it suspends all submitted coordinator jobs. +When pause time reaches for a bundle job that is in **RUNNING** status, oozie puts the job in status **PAUSED**. When pause time reaches for a bundle job that is in **RUNNINGWITHERROR** status, oozie puts the job in status **PAUSEDWITHERROR**. -When pause time reaches for a bundle job that is in *RUNNING* status, oozie puts the job in status *PAUSED*. When pause time reaches for a bundle job that is in *RUNNINGWITHERROR* status, oozie puts the job in status *PAUSEDWITHERROR*. +Conversely, when a user requests to resume a **SUSPENDED** bundle job, oozie puts the job in status **RUNNING**. Similarly, when a user requests to resume a **SUSPENDEDWITHERROR** bundle job, oozie puts the job in status **RUNNINGWITHERROR**. And when pause time is reset for a bundle job and job status is **PAUSED**, oozie puts the job in status **RUNNING**. Similarly, when the pause time is reset for a bundle job and job status is **PAUSEDWITHERROR**, oozie puts the job in status **RUNNINGWITHERROR** -Conversely, when a user requests to resume a *SUSPENDED* bundle job, oozie puts the job in status *RUNNING*. Similarly, when a user requests to resume a *SUSPENDEDWITHERROR* bundle job, oozie puts the job in status *RUNNINGWITHERROR*. And when pause time is reset for a bundle job and job status is *PAUSED*, oozie puts the job in status *RUNNING*. 
Similarly, when the pause time is reset for a bundle job and job status is *PAUSEDWITHERROR*, oozie puts the job in status *RUNNINGWITHERROR* +When all the coordinator jobs finish, oozie updates the bundle status accordingly. If all coordinators reaches to the _same_ terminal state, bundle job status also move to the same status. For example, if all coordinators are **SUCCEEDED**, oozie puts the bundle job into **SUCCEEDED** status. However, if all coordinator jobs don't finish with the same status, oozie puts the bundle job into **DONEWITHERROR**. -When all the coordinator jobs finish, oozie updates the bundle status accordingly. If all coordinators reaches to the _same_ terminal state, bundle job status also move to the same status. For example, if all coordinators are *SUCCEEDED*, oozie puts the bundle job into *SUCCEEDED* status. However, if all coordinator jobs don't finish with the same status, oozie puts the bundle job into *DONEWITHERROR*. - ----+++ 4.3. Bundle Application Definition +### 4.3. Bundle Application Definition A bundle definition is defined in XML by a name, controls and one or more coordinator application specifications: - * *%BLUE% name: %ENDCOLOR%* The name for the bundle job. - * *%BLUE% controls: %ENDCOLOR%* The control specification for the bundle. - * *%BLUE% kick-off-time: %ENDCOLOR%* It defines when the bundle job should start and submit the coordinator applications. This field is optional and the default is *NOW* that means the job should start right-a-way. - * *%BLUE% coordinator: %ENDCOLOR%* Coordinator application specification. There should be at least one coordinator application in any bundle. - * *%BLUE% name: %ENDCOLOR%* Name of the coordinator application. It can be used for referring this application through bundle to control such as kill, suspend, rerun. - * *%BLUE% enabled: %ENDCOLOR%* Enabled can be used to enable or disable a coordinator. It is optional. The default value for enabled is true. 
- * *%BLUE% app-path: %ENDCOLOR%* Path of the coordinator application definition in hdfs. This is a mandatory element. - * *%BLUE% configuration: %ENDCOLOR%* A hadoop like configuration to parameterize corresponding coordinator application. This is optional. - * *%BLUE% Parameterization: %ENDCOLOR%* Configuration properties that are a valid Java identifier, [A-Za-z_][0-9A-Za-z_]*, are available as =${NAME}= variables within the bundle application definition. Configuration properties that are not a valid Java identifier, for example =job.tracker=, are available via the =${bundle:conf(String name)}= function. Valid Java identifier properties are available via this function as well. + * **<font color="#0000ff"> name: </font>** The name for the bundle job. + * **<font color="#0000ff"> controls: </font>** The control specification for the bundle. + * **<font color="#0000ff"> kick-off-time: </font>** It defines when the bundle job should start and submit the coordinator applications. This field is optional and the default is **NOW** that means the job should start right-a-way. + * **<font color="#0000ff"> coordinator: </font>** Coordinator application specification. There should be at least one coordinator application in any bundle. + * **<font color="#0000ff"> name: </font>** Name of the coordinator application. It can be used for referring this application through bundle to control such as kill, suspend, rerun. + * **<font color="#0000ff"> enabled: </font>** Enabled can be used to enable or disable a coordinator. It is optional. The default value for enabled is true. + * **<font color="#0000ff"> app-path: </font>** Path of the coordinator application definition in hdfs. This is a mandatory element. + * **<font color="#0000ff"> configuration: </font>** A hadoop like configuration to parameterize corresponding coordinator application. This is optional. 
+ * **<font color="#0000ff"> Parameterization: </font>** Configuration properties that are a valid Java identifier, [A-Za-z_][0-9A-Za-z_]*, are available as `${NAME}` variables within the bundle application definition. Configuration properties that are not a valid Java identifier, for example `job.tracker`, are available via the `${bundle:conf(String name)}` function. Valid Java identifier properties are available via this function as well. + +**<font color="#800080">Syntax: </font>** -*%PURPLE% Syntax: %ENDCOLOR%* -<verbatim> - <bundle-app name=[NAME] xmlns='uri:oozie:bundle:0.1'> +``` + <bundle-app name=[NAME] xmlns='uri:oozie:bundle:0.1'> <controls> <kick-off-time>[DATETIME]</kick-off-time> </controls> @@ -119,16 +120,17 @@ A bundle definition is defined in XML by a name, controls and one or more coordi </configuration> </coordinator> ... -</bundle-app> -</verbatim> +</bundle-app> +``` + +**<font color="#008000"> Examples: </font>** -*%GREEN% Examples: %ENDCOLOR%* +**A Bundle Job that maintains two coordinator applications:** -*A Bundle Job that maintains two coordinator applications:* -<verbatim> -<bundle-app name='APPNAME' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.1'> +``` +<bundle-app name='APPNAME' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.1'> <controls> <kick-off-time>${kickOffTime}</kick-off-time> </controls> @@ -159,18 +161,19 @@ A bundle definition is defined in XML by a name, controls and one or more coordi </configuration> </coordinator> </bundle-app> -</verbatim> +``` ----+++ 4.4. Bundle Formal Parameters -As of schema 0.2, a list of formal parameters can be provided which will allow Oozie to verify, at submission time, that said -properties are actually specified (i.e. before the job is executed and fails). Default values can also be provided. +### 4.4. 
Bundle Formal Parameters +As of schema 0.2, a list of formal parameters can be provided which will allow Oozie to verify, at submission time, that said +properties are actually specified (i.e. before the job is executed and fails). Default values can also be provided. -*Example:* +**Example:** The previous Bundle Job application definition with formal parameters: -<verbatim> -<bundle-app name='APPNAME' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.2'> + +``` +<bundle-app name='APPNAME' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.2'> <parameters> <property> <name>appPath</name> @@ -210,17 +213,17 @@ The previous Bundle Job application definition with formal parameters: </configuration> </coordinator> </bundle-app> -</verbatim> +``` -In the above example, if =appPath= is not specified, Oozie will print an error message instead of submitting the job. If -=appPath2= is not specified, Oozie will use the default value, =hdfs://foo:8020/user/joe/job/job.properties=. +In the above example, if `appPath` is not specified, Oozie will print an error message instead of submitting the job. If +`appPath2` is not specified, Oozie will use the default value, `hdfs://foo:8020/user/joe/job/job.properties`. ----++ 5. User Propagation +## 5. User Propagation -When submitting a bundle job, the configuration must contain a =user.name= property. If security is enabled, Oozie must ensure that the value of the =user.name= property in the configuration match the user credentials present in the protocol (web services) request. +When submitting a bundle job, the configuration must contain a `user.name` property. If security is enabled, Oozie must ensure that the value of the `user.name` property in the configuration match the user credentials present in the protocol (web services) request. 
-When submitting a bundle job, the configuration may contain the =oozie.job.acl= property (the =group.name= property +When submitting a bundle job, the configuration may contain the `oozie.job.acl` property (the `group.name` property has been deprecated). If authorization is enabled, this property is treated as as the ACL for the job, it can contain user and group IDs separated by commas. @@ -228,15 +231,15 @@ The specified user and ACL are assigned to the created bundle job. Oozie must propagate the specified user and ACL to the system executing its children jobs (coordinator jobs). ----++ 6. Bundle Application Deployment +## 6. Bundle Application Deployment A bundle application consist exclusively of bundle application definition and associated coordinator application specifications. They must be installed in an HDFS directory. To submit a job for a bundle application, the full HDFS path to bundle application definition must be specified. ----+++ 6.1. Organizing Bundle Applications +### 6.1. Organizing Bundle Applications TBD. ----++ 7. Bundle Job Submission +## 7. Bundle Job Submission When a bundle job is submitted to Oozie, the submitter must specified all the required job properties plus the HDFS path to the bundle application definition for the job. @@ -245,9 +248,10 @@ The bundle application definition HDFS path must be specified in the 'oozie.bund All the bundle job properties, the HDFS path for the bundle application, the 'user.name' and 'oozie.job.acl' must be submitted to the Oozie using an XML configuration file (Hadoop XML configuration file). -*%GREEN% Example: %ENDCOLOR%*: +**<font color="#008000"> Example: </font>**: -<verbatim> + +``` <?xml version="1.0" encoding="UTF-8"?> <configuration> <property> @@ -260,23 +264,24 @@ submitted to the Oozie using an XML configuration file (Hadoop XML configuration </property> ... </configuration> -</verbatim> +``` ----++ 8. Bundle Rerun ----+++ Rerunning a Bundle Job +## 8. 
Bundle Rerun +### 8.1 Rerunning a Bundle Job Oozie provides a way of rerunning a bundle job. The user could request to rerun a subset of coordinators within a bundle by defining a list of coordinator's names. In addition, a user could define a list of dates or ranges of dates (in UTC format) to rerun for those time windows. -There is a way of asking whether to cleanup all output directories before rerun. By default, oozie will remove all output directories. Moreover, there is an option by which a user could ask to re-calculate the dynamic input directories defined by latest function in coordinators. +There is a way of asking whether to cleanup all output directories before rerun. By default, oozie will remove all output directories. Moreover, there is an option by which a user could ask to re-calculate the dynamic input directories defined by latest function in coordinators. + +### 8.2 Rerun Arguments ----+++ Rerun Arguments -<verbatim> +``` $oozie job -rerun <bundle_Job_id> [-coordinator <list of coordinator name separate by comma> [-date 2009-01-01T01:00Z::2009-05-31T23:59Z, 2009-11-10T01:00Z, 2009-12-31T22:00Z] [-nocleanup] [-refresh] -</verbatim> +``` - * The =rerun= option reruns a bundle job that is *not* in (=KILLED=, =FAILED=, =PREP=, =PREPPAUSED=, =PREPSUSPENDED=). - * Rerun a bundle job that is in =PAUSED= state will reset the paused time. + * The `rerun` option reruns a bundle job that is *not* in (`KILLED`, `FAILED`, `PREP`, `PREPPAUSED`, `PREPSUSPENDED`). + * Rerun a bundle job that is in `PAUSED` state will reset the paused time. * The option -coordinator determines the name of coordinator that will be rerun. By default all coordinators are rerun. * Multiple ranges can be used in -date. See the above examples. * The dates specified in -date must be UTC. 
@@ -285,18 +290,19 @@ $oozie job -rerun <bundle_Job_id> [-coordinator <list of coordinator name separa * If -refresh is set, all dependencies will be re-checked; otherwise only missed dependencies will be checked for the corresponding coordinators. -After the command is executed the rerun bundle job will be in =RUNNING= status. +After the command is executed the rerun bundle job will be in `RUNNING` status. + +Refer to the [Rerunning Coordinator Actions](DG_CoordinatorRerun.html) for details on rerun of coordinator job. -Refer to the [[DG_CoordinatorRerun][Rerunning Coordinator Actions]] for details on rerun of coordinator job. +## Appendixes ----++ Appendixes +### Appendix A, Oozie Bundle XML-Schema ----+++ Appendix A, Oozie Bundle XML-Schema +#### Oozie Bundle Schema 0.1 ----++++ Oozie Bundle Schema 0.1 -<verbatim> +``` <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:bundle="uri:oozie:bundle:0.1" elementFormDefault="qualified" targetNamespace="uri:oozie:bundle:0.1"> @@ -340,11 +346,12 @@ Refer to the [[DG_CoordinatorRerun][Rerunning Coordinator Actions]] for details </xs:sequence> </xs:complexType> </xs:schema> -</verbatim> +``` ----++++ Oozie Bundle Schema 0.2 +#### Oozie Bundle Schema 0.2 -<verbatim> + +``` <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:bundle="uri:oozie:bundle:0.2" elementFormDefault="qualified" targetNamespace="uri:oozie:bundle:0.2"> @@ -403,9 +410,9 @@ Refer to the [[DG_CoordinatorRerun][Rerunning Coordinator Actions]] for details </xs:sequence> </xs:complexType> </xs:schema> -</verbatim> +``` + +[::Go back to Oozie Documentation Index::](index.html) -[[index][::Go back to Oozie Documentation Index::]] -</noautolink>
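The rerun arguments documented in section 8.2 can be assembled programmatically. The following is a minimal sketch, not part of Oozie: the helper name `build_rerun_args`, the job id, and the coordinator names are hypothetical placeholders. It uses only the flags shown in the rerun syntax above (`-rerun`, `-coordinator`, `-date`, `-nocleanup`, `-refresh`) and assumes the coordinator names and the date entries are each passed as a single comma-separated value.

```python
from typing import List, Optional, Sequence


def build_rerun_args(
    bundle_job_id: str,
    coordinators: Optional[Sequence[str]] = None,
    dates: Optional[Sequence[str]] = None,
    nocleanup: bool = False,
    refresh: bool = False,
) -> List[str]:
    """Assemble an argument list for `oozie job -rerun` (hypothetical helper)."""
    if not bundle_job_id:
        raise ValueError("a bundle job id is required")
    args = ["oozie", "job", "-rerun", bundle_job_id]
    if coordinators:
        # Coordinator names go in as one comma-separated list;
        # when omitted, all coordinators are rerun.
        args += ["-coordinator", ",".join(coordinators)]
    if dates:
        # Each entry is a single UTC date or a START::END range.
        args += ["-date", ",".join(dates)]
    if nocleanup:
        # Keep the output directories instead of removing them before rerun.
        args.append("-nocleanup")
    if refresh:
        # Re-check all dependencies, not only the missed ones.
        args.append("-refresh")
    return args


example = build_rerun_args(
    "BUNDLE-JOB-ID",  # placeholder, not a real job id
    coordinators=["coordA", "coordB"],
    dates=["2009-01-01T01:00Z::2009-05-31T23:59Z", "2009-11-10T01:00Z"],
    refresh=True,
)
print(" ".join(example))
```

The resulting list can be handed to a process launcher such as `subprocess.run`; building a list rather than a single string avoids shell quoting problems with the `::` range separator.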
