[ https://issues.apache.org/jira/browse/OOZIE-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
abhishek bafna updated OOZIE-2476:
----------------------------------
Fix Version/s: 4.3.0 (was: trunk)
> When one of the action from fork fails with transient error, WF never joins.
> ----------------------------------------------------------------------------
>
> Key: OOZIE-2476
> URL: https://issues.apache.org/jira/browse/OOZIE-2476
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Fix For: 4.3.0
>
> Attachments: OOZIE-2476-V1.patch
>
>
> Noticed multiple times in our production environment.
> If one of the actions in a fork fails with a transient error (but succeeds
> after a few retries), the workflow never joins.
> This happens when one of the actions in the fork fails to submit a job.
> On a transient error, Oozie requeues the command via
> queue(this, retryDelayMillis).
> ActionStartXCommand doesn't load the job if it is not null.
> Before ActionStartXCommand runs again, other actions have already started
> and modified the job info. ActionStartXCommand still holds the old info,
> which it writes to the DB, so we lose some workflow instance data.
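
The race described above can be sketched as a small, self-contained simulation. This is only an illustration under stated assumptions: `Store`, `StartCommand`, and the `reloadOnRetry` flag are hypothetical stand-ins, not Oozie classes, and reloading on retry is one possible mitigation, not necessarily what the attached patch does.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins for illustration; none of these are Oozie's real classes.
class StaleRetryDemo {

    /** Toy "DB": job id -> set of fork branches recorded as started. */
    static final class Store {
        private final Map<String, Set<String>> rows = new HashMap<>();

        Set<String> load(String jobId) {
            // Return a copy, like reading a row into a fresh bean.
            return new HashSet<>(rows.getOrDefault(jobId, new HashSet<>()));
        }

        void save(String jobId, Set<String> started) {
            // Whole-row overwrite: the last writer wins.
            rows.put(jobId, new HashSet<>(started));
        }
    }

    /** Toy start command that keeps its job snapshot across a requeue. */
    static final class StartCommand {
        private final Store store;
        private final String jobId;
        private final String action;
        private final boolean reloadOnRetry; // assumed mitigation switch
        private Set<String> job;             // cached snapshot

        StartCommand(Store store, String jobId, String action, boolean reloadOnRetry) {
            this.store = store;
            this.jobId = jobId;
            this.action = action;
            this.reloadOnRetry = reloadOnRetry;
        }

        /** Returns false on a simulated transient submit failure. */
        boolean execute(boolean submitFails) {
            // Mirrors "doesn't load the job if it is not null".
            if (job == null || reloadOnRetry) {
                job = store.load(jobId);
            }
            if (submitFails) {
                return false; // the caller requeues this same object
            }
            job.add(action);
            store.save(jobId, job);
            return true;
        }
    }

    /** Runs the fork scenario and returns what the store holds afterwards. */
    static Set<String> run(boolean reloadOnRetry) {
        Store store = new Store();
        StartCommand a = new StartCommand(store, "wf-1", "A", reloadOnRetry);
        StartCommand b = new StartCommand(store, "wf-1", "B", false);
        a.execute(true);  // A: transient failure, snapshot stays cached
        b.execute(false); // B: starts and records itself
        a.execute(false); // A: retry after the delay
        return store.load("wf-1");
    }

    public static void main(String[] args) {
        System.out.println("cached snapshot: " + run(false));
        System.out.println("reload on retry: " + run(true));
    }
}
```

With the cached snapshot, the retry of A overwrites the row written by B, so B's record is lost. With a reload before the retry, both branches remain recorded.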
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)