[jira] [Commented] (OOZIE-2445) Doc for - Specifying coordinator input datasets in more logical ways (OOZIE-1976)

Purshotam Shah (JIRA) Thu, 09 Jun 2016 11:43:50 -0700

    [ 
https://issues.apache.org/jira/browse/OOZIE-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323072#comment-15323072
 ]


Purshotam Shah commented on OOZIE-2445:
---------------------------------------

It's just a doc change; I verified the patch locally.

{code}
********Test tabs.....
---------------
********New Test tabs.....
---------------
*******Test trailing spaces.....
       0
---------------
*******Test lines greater than 132
+   * COMBINE :  With combine, instances of A and B can be interleaved to get the final "combined" set of total instances. All datasets in combine should have the same range defined with the current EL function. Combine does not support latest and future EL functions. Combine cannot also be nested.
+   * *%BLUE% WAIT (in minutes): %ENDCOLOR%* If all dependencies are not met, and MIN dependencies are met,  then Oozie will keep on waiting for more instances till wait time elapses or all dependent data are available.
+The conditional logic can be specified using the <input-logic> tag in the coordinator.xml using the [[CoordinatorFunctionalSpec#Oozie_Coordinator_Schema_0.5][Oozie Coordinator Schema 0.5]] and above. If not specified, the default behavior of "AND" of all defined input dependencies is applied.
+Order of definition of the dataset matters. Availability of inputs is checked in that order. Only if input instances of the first dataset is not available, then the input instances of the second dataset will be checked and so on. In the case of AND or OR, the second dataset is picked only if the first dataset does not meet all the input dependencies first. In the case of COMBINE, only the input instances missing on the first dataset are checked for availability on the other datasets in order and then included.
+coord:dataIn() function can be used to get the comma separated list of evaluated hdfs paths given the name of the conditional operator.
+With above expression one can specify the dataset as AorB. Action will start running as soon dataset A or B is available. Dataset "A" has higher precedence over "B" because it is defined first. Oozie will first check for availability of dataset A and only if A is not available, availability of dataset B will be checked.
+            <uri-template>${nameNode}/user/${coord:user()}/${examplesRoot}/input-data/rawLogs/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}</uri-template>
+            <uri-template>${nameNode}/user/${coord:user()}/${examplesRoot}/input-data/rawLogs-2/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}</uri-template>
+After the mininum two dependencies are available, processing will wait for additional 10 minutes to include any dependencies that become available during that period.
+MIN and WAIT can be used at parent level, which will get propagated to child node. Above expression is equivalent to dataset A with min = 2 and wait = 10 minutes and dataset B with min = 2 and wait = 10 minutes.
+            <uri-template>${nameNode}/user/${coord:user()}/${examplesRoot}/input-data/rawLogs/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}</uri-template>
+            <uri-template>${nameNode}/user/${coord:user()}/${examplesRoot}/input-data/rawLogs-2/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}</uri-template>
+            <uri-template>${nameNode}/user/${coord:user()}/${examplesRoot}/output-data/inputLogic/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
13
{code}

The lines flagged by the "lines greater than 132" test are from the twiki.
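
For anyone who wants to repeat this kind of check on a patch locally, here is a rough sketch of the three tests named in the output above (tabs, trailing spaces, lines greater than 132). It is not the actual Oozie patch-testing script. The function name `check_added_lines`, the choice to count the leading `+` toward the line length, and the decision to skip `+++` file headers are all assumptions made for illustration.

```python
MAX_LEN = 132  # limit named in the check output above


def check_added_lines(patch_text, max_len=MAX_LEN):
    """Group the added lines of a unified diff by the check they fail.

    Hypothetical helper: only lines starting with '+' are inspected, and
    the '+++ b/file' header lines are skipped.
    """
    tabs, trailing, too_long = [], [], []
    for line in patch_text.splitlines():
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if "\t" in line:
            tabs.append(line)
        if line != line.rstrip():
            trailing.append(line)
        # Assumption: the leading '+' counts toward the length.
        if len(line) > max_len:
            too_long.append(line)
    return {"tabs": tabs, "trailing": trailing, "too_long": too_long}


if __name__ == "__main__":
    sample = (
        "+++ b/docs/example.twiki\n"
        "+short line\n"
        "+" + "x" * 140 + "\n"
        "+tab\there\n"
        "+trailing \n"
        " unchanged context\n"
    )
    report = check_added_lines(sample)
    for name, lines in report.items():
        print(name, len(lines))
```

Counting the entries in `too_long` corresponds to the final number printed in the output above, under the assumptions stated here.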

> Doc for -  Specifying coordinator input datasets in more logical ways 
> (OOZIE-1976)
> ----------------------------------------------------------------------------------
>
>                 Key: OOZIE-2445
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2445
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah
>            Assignee: Purshotam Shah
>         Attachments: CoordinatorFunctionalSpec.html, OOZIE-2445-V2.patch, 
> OOZIE-2445-V3.patch, OOZIE-2445-V4.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
