Robert Kanter created OOZIE-2397:
------------------------------------

             Summary: LAST_ONLY and NONE don't properly handle READY actions
                 Key: OOZIE-2397
                 URL: https://issues.apache.org/jira/browse/OOZIE-2397
             Project: Oozie
          Issue Type: Bug
          Components: core
    Affects Versions: 4.2.0
            Reporter: Robert Kanter
            Assignee: Robert Kanter
            Priority: Critical
             Fix For: trunk


When using LAST_ONLY or NONE, actions are supposed to be able to transition 
from READY to SKIPPED if the right criteria are met, but they don't. This is in 
contrast to the timeout feature, which only applies to WAITING actions.

Here's a more detailed technical description of the problem:
We handle LAST_ONLY in 
[CoordMaterializeTransitionXCommand|http://github.mtv.cloudera.com/CDH/oozie/blob/cdh5-4.1.0_5.5.0/core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java#L242]
 and 
[CoordActionInputCheckXCommand|http://github.mtv.cloudera.com/CDH/oozie/blob/cdh5-4.1.0_5.5.0/core/src/main/java/org/apache/oozie/command/coord/CoordActionInputCheckXCommand.java#L156].
  The former materializes actions and sets "old" actions to SKIPPED as it 
materializes them.  The latter checks the input datasets for an action to 
determine whether a WAITING action can transition to READY (deps are met) and 
does all that entails, including changing the status to READY and queuing a 
CoordActionReadyXCommand.  If the deps are not met because the dataset is not 
there yet, it requeues itself after a delay.  So, these only handle the 
materialization and WAITING states.  However, LAST_ONLY is supposed to also do 
READY --> SKIPPED if its condition is met (unlike TIMEDOUT, which can only come 
from WAITING; *this additional difference should probably be called out in the 
docs*).  
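
To make the intended transitions concrete, here is a rough sketch of the ones 
named in this issue.  CoordActionTransitions and its Status enum are 
hypothetical stand-ins, not Oozie classes; only the statuses mentioned above 
are modeled, and WAITING --> SKIPPED is my reading of "these only handle the 
materialization and WAITING states".

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the transitions discussed in this issue (not Oozie code). */
final class CoordActionTransitions {
    /** Only the statuses named in the description; real actions have more. */
    enum Status { WAITING, READY, SKIPPED, TIMEDOUT }

    private static final Map<Status, Set<Status>> INTENDED = new EnumMap<>(Status.class);
    static {
        // WAITING can become READY, be skipped, or time out.
        INTENDED.put(Status.WAITING,
                EnumSet.of(Status.READY, Status.SKIPPED, Status.TIMEDOUT));
        // READY should be able to become SKIPPED (the missing behavior),
        // but never TIMEDOUT.
        INTENDED.put(Status.READY, EnumSet.of(Status.SKIPPED));
        // SKIPPED and TIMEDOUT are treated as terminal in this sketch.
        INTENDED.put(Status.SKIPPED, EnumSet.noneOf(Status.class));
        INTENDED.put(Status.TIMEDOUT, EnumSet.noneOf(Status.class));
    }

    /** Returns true if the sketch considers from --> to an intended transition. */
    static boolean isIntended(Status from, Status to) {
        return INTENDED.get(from).contains(to);
    }
}
```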

[CoordActionReadyXCommand|http://github.mtv.cloudera.com/CDH/oozie/blob/cdh5-4.1.0_5.5.0/core/src/main/java/org/apache/oozie/command/coord/CoordActionReadyXCommand.java#L103]
 needs to be updated to handle LAST_ONLY.  It currently treats LAST_ONLY the 
same as LIFO (via CoordJobGetReadyActionsJPAExecutor), where the order is the 
only difference from FIFO.  After retrieving all READY actions, it should check 
if any meet their LAST_ONLY condition, and if so, queue a 
CoordActionSkipXCommand for them (maybe make a bulk version?) instead of a 
CoordActionStartXCommand.
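
A minimal sketch of that check, not the actual patch.  ReadyActionRouter, 
ReadyAction, and the frequency/tolerance parameters are hypothetical names for 
illustration.  The skip conditions are assumptions too: LAST_ONLY skips once the 
next action's nominal time has passed, and NONE skips once the action's own 
nominal time plus a tolerance has passed.  In a real fix, SKIP would correspond 
to queuing a CoordActionSkipXCommand and START to queuing a 
CoordActionStartXCommand.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the routing step; none of these types are Oozie classes. */
final class ReadyActionRouter {
    enum Execution { FIFO, LIFO, LAST_ONLY, NONE }
    enum Decision { START, SKIP }

    /** Minimal model of a READY coordinator action. */
    static final class ReadyAction {
        final String id;
        final Instant nominalTime;

        ReadyAction(String id, Instant nominalTime) {
            this.id = id;
            this.nominalTime = nominalTime;
        }
    }

    /** Decides whether a READY action should be started or skipped. */
    static Decision decide(Execution execution, ReadyAction action,
                           Duration frequency, Duration tolerance, Instant now) {
        switch (execution) {
            case LAST_ONLY:
                // Assumed condition: a newer action is already due.
                Instant nextNominal = action.nominalTime.plus(frequency);
                return now.isAfter(nextNominal) ? Decision.SKIP : Decision.START;
            case NONE:
                // Assumed condition: the action itself is older than the tolerance.
                Instant deadline = action.nominalTime.plus(tolerance);
                return now.isAfter(deadline) ? Decision.SKIP : Decision.START;
            default:
                // FIFO and LIFO differ only in ordering; every READY action starts.
                return Decision.START;
        }
    }

    /** Collects the ids of READY actions that should be skipped (the "bulk" idea). */
    static List<String> idsToSkip(Execution execution, List<ReadyAction> ready,
                                  Duration frequency, Duration tolerance, Instant now) {
        List<String> skipped = new ArrayList<>();
        for (ReadyAction action : ready) {
            if (decide(execution, action, frequency, tolerance, now) == Decision.SKIP) {
                skipped.add(action.id);
            }
        }
        return skipped;
    }
}
```

Collecting the ids first would let a bulk skip update them in one pass, instead 
of queuing one command per action.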

We have the same issue with NONE, which has similar behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
