[ 
https://issues.apache.org/jira/browse/OOZIE-865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413182#comment-13413182
 ] 

Robert Kanter commented on OOZIE-865:
-------------------------------------

In your first example, the possible execution paths are:
{code}
fork -> action1 -> join -> action3
     -> action2 -> 
{code}
and
{code}
fork -> action1 -> join -> action3
     ->         ->
{code}

But in the second example, they are:
{code}
fork -> action1 -> join -> action3
     -> action2 -> 
{code}
and
{code}
action3
{code}

So the two examples aren't equivalent: when action2 is skipped, the first 
example still runs action1 but the second does not.  To fix the second example, 
the decision node should be "decision -> action1, fork", but I'm not sure that 
is allowed (or, if it is, whether it should be), because if the decision node 
chooses action1 instead of fork, then action1 will go to join without having 
been forked first.  I think to make an equivalent example, we'd have to have 
"decision -> action4, fork" where action4 is exactly the same as action1 except 
that it goes to action3 instead of join.  But having duplicate nodes isn't good 
either.
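
To make the difference concrete, here is a rough Python sketch (nothing to do 
with Oozie's own code) that enumerates which actions run under each example.  
The helper names (fork, decision, then, action, SKIP) are made up for 
illustration, and the two workflow shapes are my reading of the path diagrams 
above, not the actual workflow.xml files.

```python
from itertools import product

def action(name):
    """A single action has exactly one outcome: it runs."""
    return [frozenset({name})]

# A branch that does nothing (e.g. a decision that goes straight to the join).
SKIP = [frozenset()]

def fork(*branches):
    """All branches run, so combine one outcome from every branch."""
    return [frozenset().union(*combo) for combo in product(*branches)]

def decision(*alternatives):
    """Exactly one alternative runs, so collect the outcomes of each."""
    return [outcome for alt in alternatives for outcome in alt]

def then(outcomes, name):
    """Run the action `name` after whatever happened before."""
    return [outcome | {name} for outcome in outcomes]

# First example (assumed shape): the decision sits inside the fork.
first = then(fork(action("action1"), decision(action("action2"), SKIP)), "action3")

# Second example (assumed shape): the decision sits in front of the fork.
second = decision(
    then(fork(action("action1"), action("action2")), "action3"),
    action("action3"),
)

for label, outcomes in (("first", first), ("second", second)):
    print(label, sorted(sorted(outcome) for outcome in outcomes))

print("equivalent:", set(first) == set(second))
```

Under these assumptions the two outcome sets differ only in the run where 
action2 is skipped, which is the case discussed above.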

Even if there is a way to re-write every possible workflow to an equivalent one 
without a decision going to a join, it seems to me that we should try to make 
it as flexible as possible.  Is there a specific reason why we shouldn't allow 
this?  

Besides, you can easily emulate this by having the decision node go to a no-op 
action that then goes to the join.  So, people determined to write a workflow 
with a decision node that goes to a join would still be able to do so anyway.  
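
A small Python sketch of why such a restriction is easy to sidestep.  The rule 
below ("a decision may not transition directly to a join") is a hypothetical 
stand-in, not Oozie's actual validator, and the node names and the dict layout 
are assumptions made up for the example.

```python
def decision_feeds_join(workflow):
    """Return True if any decision node transitions directly to a join node.

    `workflow` maps node name -> (kind, list of successor names).
    """
    for kind, targets in workflow.values():
        if kind == "decision" and any(workflow[t][0] == "join" for t in targets):
            return True
    return False

# A decision inside a fork that can skip action2 and go straight to the join.
direct = {
    "fork": ("fork", ["action1", "decision"]),
    "action1": ("action", ["join"]),
    "decision": ("decision", ["action2", "join"]),
    "action2": ("action", ["join"]),
    "join": ("join", ["action3"]),
    "action3": ("action", []),
}

# The same workflow, except the decision goes to a no-op action that goes to the join.
with_noop = dict(
    direct,
    decision=("decision", ["action2", "noop"]),
    noop=("action", ["join"]),
)

print(decision_feeds_join(direct))     # the hypothetical rule flags this one
print(decision_feeds_join(with_noop))  # same behaviour at runtime, but not flagged
```

The two workflows behave the same apart from the no-op, yet only the first 
trips the hypothetical rule.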
                
> ForkJoin validator checks total lengths of forks vs. joins instead of actual 
> paths
> ----------------------------------------------------------------------------------
>
>                 Key: OOZIE-865
>                 URL: https://issues.apache.org/jira/browse/OOZIE-865
>             Project: Oozie
>          Issue Type: Bug
>    Affects Versions: 3.2.0
>            Reporter: Harsh J
>         Attachments: workflow.xml
>
>
> Consider a WF that has four fork paths, each leading to a decision node, 
> where every path eventually ends at a single join node (thereby resulting 
> in a valid DAG).
> When such a WF is passed to Oozie and the fork join validator is enabled, 
> validation fails because numForks(4) > numJoins(1). Comparing raw counts 
> this way appears to be wrong; ideally we should only compare true path-based 
> forks->joins lists, if possible.
> This causes a regression if the fork join validation is left enabled. 
> Workaround for such workflows currently is to disable fork join validation 
> via {{oozie.validate.ForkJoin}} set to {{false}} at the server.


        
