[ 
https://issues.apache.org/jira/browse/OOZIE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated OOZIE-2090:
---------------------------------
    Attachment: OOZIE-2090.patch

wf:lastErrorNode() works as follows: {{DagELFunctions.setActionInfo(...)}} 
checks whether the action's status is ERROR and, if so, sets a variable to the 
node's name.  
The problem is that {{ActionEndXCommand}} temporarily changes the action's 
status to ERROR in this case. The status is set back to END_RETRY if there's a 
retry, but the {{DagELFunctions.setActionInfo(...)}} call happens first, so it 
always treats the action as failed.  
The patch simply moves the call to later in {{ActionEndXCommand}}, so that the 
action's status can "stabilize" before the node is recorded, or not recorded, 
as the last error node.  
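
To illustrate the ordering issue, here is a minimal, self-contained sketch. It 
is not the Oozie code or the patch: the class {{LastErrorNodeSketch}}, the 
{{Status}} enum, the variable key, and the two {{endAction...}} methods are 
hypothetical stand-ins. Only the idea is taken from the description above: the 
node name is recorded when the status is ERROR, and the status is switched to 
END_RETRY when a retry is available.

```java
import java.util.HashMap;
import java.util.Map;

class LastErrorNodeSketch {
    // Hypothetical, simplified set of action statuses.
    enum Status { ERROR, END_RETRY }

    // Hypothetical variable key; not the key Oozie actually uses.
    static final String LAST_ERROR_NODE = "lastErrorNode";

    // Stand-in for DagELFunctions.setActionInfo(...): records the node
    // name only when the action's status is ERROR at the time of the call.
    static void setActionInfo(Map<String, String> vars, String node, Status status) {
        if (status == Status.ERROR) {
            vars.put(LAST_ERROR_NODE, node);
        }
    }

    // Ordering before the patch: the node is recorded while the status
    // is still the temporary ERROR, before the retry decision is applied.
    static String endActionBefore(String node, boolean retryAvailable) {
        Map<String, String> vars = new HashMap<>();
        Status status = Status.ERROR;          // temporary status
        setActionInfo(vars, node, status);     // called too early
        if (retryAvailable) {
            status = Status.END_RETRY;         // too late to affect vars
        }
        return vars.get(LAST_ERROR_NODE);
    }

    // Ordering after the patch: the status settles first, then the node
    // is recorded (or not) based on the final status.
    static String endActionAfter(String node, boolean retryAvailable) {
        Map<String, String> vars = new HashMap<>();
        Status status = Status.ERROR;          // temporary status
        if (retryAvailable) {
            status = Status.END_RETRY;         // status "stabilizes"
        }
        setActionInfo(vars, node, status);     // called after the decision
        return vars.get(LAST_ERROR_NODE);
    }

    public static void main(String[] args) {
        System.out.println(endActionBefore("my-action", true));  // my-action
        System.out.println(endActionAfter("my-action", true));   // null
        System.out.println(endActionAfter("my-action", false));  // my-action
    }
}
```

In this sketch, the early call reports the node as the last error node even 
though a retry is pending, while the late call reports it only when no retry 
is available.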

> wf:lastErrorNode does not take into account transient errors with retries
> -------------------------------------------------------------------------
>
>                 Key: OOZIE-2090
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2090
>             Project: Oozie
>          Issue Type: Bug
>    Affects Versions: 4.1.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: OOZIE-2090.patch
>
>
> Suppose you have a workflow where an action fails on the first try, but the 
> automatic retry behavior for transient failures kicks in, and it succeeds on 
> one of the later tries.  Currently, the wf:lastErrorNode() EL Function will 
> show that this node failed, even though it ultimately succeeded.
> We should have wf:lastErrorNode() take into account the auto-retries by not 
> setting the last error node until the last retry occurs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
