[
https://issues.apache.org/jira/browse/OOZIE-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Kanter updated OOZIE-1319:
---------------------------------
Attachment: OOZIE-1319.patch
I first tried to re-use the TIMEDOUT status and logic, but that didn't work for
a few reasons, including the fact that TIMEDOUT causes the coordinator job to
transition to DONEWITHERROR and not SUCCEEDED.
So I ended up mirroring the TIMEDOUT logic to some extent for the new SKIPPED
status. This ended up being much simpler than the old approach. I kept some
of the old approach's logic: for example, when materializing actions in
CoordMaterializeTransitionXCommand for a LAST_ONLY job that is in the past, it
still needs to materialize all actions up to "now". The new logic is in
CoordActionInputCheckXCommand where we check if an action's dependencies are
met (this also gets called for actions without dependencies); the TIMEDOUT
logic is here too. I added some code that will set the action's status to
SKIPPED if the current time is later than the next action's nominal time; this
way, there is only ever one WAITING action whose time dependency has been met.
The documentation in the patch gives a more user-oriented explanation that
may be clearer.
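
To illustrate the SKIPPED check described above, here is a minimal standalone
sketch. It is not the code from the patch: the class, enum, and method names
are hypothetical, and it assumes the next action's nominal time is simply the
current action's nominal time plus a fixed frequency, which glosses over how
Oozie actually computes nominal times.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; none of these names come from Oozie.
class LastOnlySkipSketch {

    enum Status { WAITING, SKIPPED, TIMEDOUT }

    // Returns the status a LAST_ONLY action should have after the check.
    // Assumption: the next nominal time is nominalTime + frequency.
    static Status checkLastOnly(Status current, Instant nominalTime,
                                Duration frequency, Instant now) {
        if (current != Status.WAITING) {
            return current; // only WAITING actions are candidates for SKIPPED
        }
        Instant nextNominalTime = nominalTime.plus(frequency);
        // Skip only when "now" is strictly later than the next action's
        // nominal time, i.e. a newer action has already become due.
        return now.isAfter(nextNominalTime) ? Status.SKIPPED : Status.WAITING;
    }

    public static void main(String[] args) {
        Instant nominal = Instant.parse("2014-01-01T00:00:00Z");
        Duration hourly = Duration.ofHours(1);

        // 90 minutes later the next hourly action is already due.
        System.out.println(checkLastOnly(Status.WAITING, nominal, hourly,
                nominal.plus(Duration.ofMinutes(90))));

        // 30 minutes later this is still the latest action.
        System.out.println(checkLastOnly(Status.WAITING, nominal, hourly,
                nominal.plus(Duration.ofMinutes(30))));
    }
}
```

Applied to every WAITING action in nominal-time order, a check like this
leaves at most one WAITING action whose time dependency has been met.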
This is a little more restrictive than the old approach would have been had it
worked properly. In effect, the new approach doesn't allow two WAITING actions
whose time dependencies are met but whose data dependencies are not to both
become SUBMITTED, while the old approach would have allowed this. However, I
think this restriction makes sense because with LAST_ONLY you're only
interested in the last action anyway; older actions should be SKIPPED, even if
the last action's (data) dependencies haven't been met yet.
> "LAST_ONLY" in execution control for coordinator job still runs all the
> actions
> -------------------------------------------------------------------------------
>
> Key: OOZIE-1319
> URL: https://issues.apache.org/jira/browse/OOZIE-1319
> Project: Oozie
> Issue Type: Bug
> Reporter: Bowen Zhang
> Assignee: Robert Kanter
> Attachments: OOZIE-1319.patch, OOZIE-1319.patch, OOZIE-1319.patch,
> OOZIE-1319.patch, OOZIE-1319.patch, oozie-1319.patch
>
>
> In execute() of CoordJobGetReadyActionsJPAExecutor.java, once we retrieve the
> top item from a "LIFO" query result, we do not discard or delete the
> remaining items from the result list. As a result, the next time execute() is
> invoked, we will be retrieving the next item in line. Consequently, the
> LAST_ONLY strategy will also execute all ready actions for a given
> coordinator job, making it no different from LIFO.