[ 
https://issues.apache.org/jira/browse/OOZIE-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193422#comment-13193422
 ] 

[email protected] commented on OOZIE-636:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3486/
-----------------------------------------------------------

(Updated 2012-01-25 23:00:15.001704)


Review request for oozie, Mohammad Islam and Angelo K. Huang.


Changes
-------

Integrating Alejandro's and Santhosh's comments. Thanks! 

Please review the code. In the meantime, I will add a few more test cases.


Summary
-------

Validate fork and join at wf submission time
https://issues.apache.org/jira/browse/OOZIE-636

Brief description of the algorithm:

A modified DFS algorithm is used. Two stacks are kept: one for the DFS 
traversal and one for tracking fork/join status. When a fork is encountered 
during traversal, it is pushed onto the forkJoin stack together with the 
number of paths leaving the fork. When a node's child is a join, the join is 
added to the forkJoin stack and the number of paths reaching it is updated. 
When the number of paths for the fork and the join are equal, the fork/join 
pair is removed from the forkJoin stack and the join is pushed onto the 
dfsStack.

Nodes other than fork and join are pushed only onto the dfsStack.
If an action node is seen, only the node's "ok-to" transition is considered.


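while (!dfsStack.isEmpty()) {
        Node n = dfsStack.pop();
        n.traversed = true;
        if (n.type == fork) {
                // Remember the open fork and how many paths leave it.
                forkJoinStack.push(new Element(n, n.paths));
        }
        List<Node> children = getUnvisitedChildNodes(n);
        for (Node child : children) {
                if (child.type == join) {
                        // Updates the join's path count; true once it matches the fork's.
                        boolean cleared = isForkJoinCleared(forkJoinStack);
                        if (!cleared) {
                                continue;
                        }
                }
                dfsStack.push(child);
                child.traversed = true;
        }
}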
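
For reviewers who want to experiment with the idea, the pseudocode above can be approximated by the self-contained sketch below. It is not the code in LiteWorkflowAppParser: the names ForkJoinCheck, Node, Frame, isValid, balanced and fromIssue are hypothetical, the workflow is assumed to be acyclic, and only "ok-to" transitions are modelled. The second sample graph mirrors the workflow quoted in the JIRA issue (three forks, two joins).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ForkJoinCheck {
    enum Type { FORK, JOIN, OTHER }

    // Hypothetical node model: for an action, next holds only its "ok-to" target.
    static final class Node {
        final String name; final Type type; final List<String> next;
        Node(String name, Type type, String... next) {
            this.name = name; this.type = type; this.next = Arrays.asList(next);
        }
    }

    // Entry on the fork/join stack: an open fork and the join its paths reach.
    static final class Frame {
        final int forkPaths; String join; int joinPaths;
        Frame(int forkPaths) { this.forkPaths = forkPaths; }
    }

    static boolean isValid(Map<String, Node> wf, String start) {
        Deque<String> dfsStack = new ArrayDeque<>();
        Deque<Frame> forkJoinStack = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        dfsStack.push(start);
        visited.add(start);
        while (!dfsStack.isEmpty()) {
            Node n = wf.get(dfsStack.pop());
            if (n.type == Type.FORK) {
                forkJoinStack.push(new Frame(n.next.size()));
            }
            for (String childName : n.next) {
                if (wf.get(childName).type == Type.JOIN) {
                    Frame top = forkJoinStack.peek();
                    if (top == null) return false;                 // join with no open fork
                    if (top.join == null) top.join = childName;
                    if (!top.join.equals(childName)) return false; // one fork, two different joins
                    if (++top.joinPaths < top.forkPaths) continue; // other paths still pending
                    forkJoinStack.pop();                           // pair cleared
                    dfsStack.push(childName);
                } else if (visited.add(childName)) {
                    dfsStack.push(childName);
                }
            }
        }
        return forkJoinStack.isEmpty();                            // a leftover fork was never joined
    }

    static Map<String, Node> graph(Node... nodes) {
        Map<String, Node> m = new HashMap<>();
        for (Node n : nodes) m.put(n.name, n);
        return m;
    }

    static Map<String, Node> balanced() {
        return graph(new Node("start", Type.OTHER, "fork1"),
                new Node("fork1", Type.FORK, "a", "b"),
                new Node("a", Type.OTHER, "join1"),
                new Node("b", Type.OTHER, "join1"),
                new Node("join1", Type.JOIN, "end"),
                new Node("end", Type.OTHER));
    }

    // Same shape as the workflow.xml in OOZIE-636: fork1, fork12, fork2 vs. join12, join1.
    static Map<String, Node> fromIssue() {
        return graph(new Node("start", Type.OTHER, "fork1"),
                new Node("fork1", Type.FORK, "java11", "fork12"),
                new Node("java11", Type.OTHER, "java12"),
                new Node("java12", Type.OTHER, "join1"),
                new Node("fork12", Type.FORK, "java121", "java122"),
                new Node("java121", Type.OTHER, "join12"),
                new Node("java122", Type.OTHER, "join12"),
                new Node("join12", Type.JOIN, "fork2"),
                new Node("fork2", Type.FORK, "java21", "java22"),
                new Node("java21", Type.OTHER, "join1"),
                new Node("java22", Type.OTHER, "join1"),
                new Node("join1", Type.JOIN, "end"),
                new Node("end", Type.OTHER));
    }

    public static void main(String[] args) {
        System.out.println("balanced: " + isValid(balanced(), "start"));         // balanced: true
        System.out.println("issue workflow: " + isValid(fromIssue(), "start"));  // issue workflow: false
    }
}
```

In the issue's workflow, fork12/join12 and fork2/join1 clear normally, but fork1 then sees only one of its two paths arrive at join1, so its entry stays on the fork/join stack and the check fails.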


This addresses bug OOZIE-636.
    https://issues.apache.org/jira/browse/OOZIE-636


Diffs (updated)
-----

  trunk/core/src/main/java/org/apache/oozie/ErrorCode.java 1235973 
  trunk/core/src/main/java/org/apache/oozie/workflow/lite/LiteWorkflowAppParser.java 1235973 
  trunk/core/src/test/java/org/apache/oozie/service/TestLiteWorkflowAppService.java 1235973 
  trunk/core/src/test/java/org/apache/oozie/workflow/lite/TestLiteWorkflowAppParser.java 1235973 
  trunk/core/src/test/resources/wf-schema-valid.xml 1235973 

Diff: https://reviews.apache.org/r/3486/diff


Testing
-------

Test case to validate fork-join added


Thanks,

Virag


                
> Check fork and join in the workflow in the submission time 
> -----------------------------------------------------------
>
>                 Key: OOZIE-636
>                 URL: https://issues.apache.org/jira/browse/OOZIE-636
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Virag Kothari
>
> Enhancement: Oozie should check that fork and join nodes are correctly 
> paired when the user submits the job. This should be a static check, not 
> one made while the workflow is running.
> Current logic bug:
> A workflow with a different number of forks and joins was run. The wf job 
> should have been killed, but it succeeded. Also, strangely, the action was 
> killed. 
> The following tests were run with varying delays; their results are listed 
> below.
> test1: wf job SUCCEEDED, action java12 KILLED.
> delay11=11
> delay12=12
> delay121=1
> delay122=2
> delay21=1
> delay22=1
> test2: wf job SUCCEEDED, action java12 KILLED. 
> delay11=1
> delay12=12
> delay121=1
> delay122=2
> delay21=1
> delay22=1
> test3: wf job SUCCEEDED, all actions OK. Question: why does the wf job 
> always pass in this scenario, even when fork and join are not paired?
> delay11=10
> delay12=10
> delay121=15
> delay122=15
> delay21=20
> delay22=20
> workflow.xml
> ============
> <workflow-app xmlns='uri:oozie:workflow:0.1' name='fork-join-4735180-wf'>
>     <start to='fork1' />
>     <fork name="fork1">
>         <path start="java11" />
>         <path start="fork12" />
>     </fork>
>     <action name='java11'>
>         <java>
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>             <main-class>qa.test.tests.testsleep</main-class>
>             <arg>${delay11}</arg>
>         </java>
>         <ok to="java12" />
>         <error to="fail" />
>     </action>
>     <action name='java12'>
>         <java>
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>             <main-class>qa.test.tests.testsleep</main-class>
>             <arg>${delay12}</arg>
>         </java>
>         <ok to="join1" />
>         <error to="fail" />
>     </action>
>     <fork name="fork12">
>         <path start="java121" />
>         <path start="java122" />
>     </fork>
>     <action name='java121'>
>         <java>
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>             <main-class>qa.test.tests.testsleep</main-class>
>             <arg>${delay121}</arg>
>         </java>
>         <ok to="join12" />
>         <error to="fail" />
>     </action>
>     <action name='java122'>
>         <java>
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>             <main-class>qa.test.tests.testsleep</main-class>
>             <arg>${delay122}</arg>
>         </java>
>         <ok to="join12" />
>         <error to="fail" />
>     </action>
>     <join name="join12" to="fork2" />
>     <fork name="fork2">
>         <path start="java21" />
>         <path start="java22" />
>     </fork>
>     <action name='java21'>
>         <java>
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>             <main-class>qa.test.tests.testsleep</main-class>
>             <arg>${delay21}</arg>
>         </java>
>         <ok to="join1" />
>         <error to="fail" />
>     </action>
>     <action name='java22'>
>         <java>
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>             <main-class>qa.test.tests.testsleep</main-class>
>             <arg>${delay22}</arg>
>         </java>
>         <ok to="join1" />
>         <error to="fail" />
>     </action>
>     <join name="join1" to="end" />
>     <kill name="fail">
>         <message>Streaming Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>     </kill>
>     <end name='end' />
> </workflow-app>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

