[
https://issues.apache.org/jira/browse/OOZIE-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459708#comment-13459708
]
Robert Kanter commented on OOZIE-994:
-------------------------------------
I believe the START_MANUAL/START_RETRY statuses are for when the action is
being started (i.e. in ActionStartXCommand), right? At this point, the action
has already started running before it gets the error, so it would behave more
like a suspended workflow (in which case the action's status is still
RUNNING).
> ActionCheckXCommand does not handle failures properly
> -----------------------------------------------------
>
> Key: OOZIE-994
> URL: https://issues.apache.org/jira/browse/OOZIE-994
> Project: Oozie
> Issue Type: Bug
> Components: workflow
> Affects Versions: 3.2.0
> Reporter: Alejandro Abdelnur
> Assignee: Robert Kanter
> Priority: Critical
> Fix For: trunk
>
> Attachments: OOZIE-994.patch
>
>
> If the JT restarts or dies and running jobs are lost, or the JT is not
> reachable, Oozie's ActionCheckXCommand will never fail the workflow job.
> There seem to be 2 issues here:
> * convertException is no longer receiving the root cause exception, but
> always a HadoopAccessorException wrapping the root cause exception. We should
> modify convertException to inspect the cause exception as well.
> * ActionCheckXCommand does not implement the retry-handling logic of
> ActionStartXCommand.
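
For illustration only, the first point in the description (inspecting the
cause exception rather than only the wrapper) could be sketched roughly as
below. This is not the code in OOZIE-994.patch. The names CauseInspection,
WrapperException, unwrap and isTransient are hypothetical; WrapperException
merely stands in for HadoopAccessorException, and treating a ConnectException
as transient is an assumption made for the example.

```java
import java.net.ConnectException;

public class CauseInspection {

    // Hypothetical stand-in for a wrapper such as HadoopAccessorException.
    static class WrapperException extends Exception {
        WrapperException(Throwable cause) {
            super(cause);
        }
    }

    // Walk the cause chain until reaching a throwable that is not the
    // wrapper type (or a wrapper that has no cause to descend into).
    static Throwable unwrap(Throwable t) {
        Throwable current = t;
        while (current instanceof WrapperException && current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Assumed classification for this sketch: a connection failure is
    // treated as transient, anything else is not.
    static boolean isTransient(Throwable t) {
        return unwrap(t) instanceof ConnectException;
    }

    public static void main(String[] args) {
        Exception wrapped = new WrapperException(new ConnectException("JT not reachable"));
        System.out.println(unwrap(wrapped).getClass().getSimpleName());
        System.out.println(isTransient(wrapped));
    }
}
```

The point of the sketch is that the classification is made on the unwrapped
cause, so a wrapper around a connectivity error is handled the same way as
the connectivity error itself.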